Earlier this year, Facebook launched Graph Search in a limited release to help users search their friends' activity on the site. Engineers soon had to adjust the search engine's algorithms so that its machines could recognize slang and colloquial language, for example interpreting "my buddy's photos" as "my friend's photos".
After the algorithm was changed, Graph Search's results improved somewhat, but that was only the first step in a long journey. Like tech giants Google and Apple, Facebook has begun investing in "deep learning," an emerging field of computer science. Simply put, it involves building computer programs that process data somewhat the way a human brain does, so that computers can understand human language and behavior. A Facebook spokesperson said the company has only just begun research in this area.
In English, even common words like "off," "the," and "hook" have many meanings, and a phrase built from them, such as "off the hook," can be interpreted differently depending on context. For example, "cool" sometimes means "great," but Facebook's machines may not know what "cool" means, because they haven't been "educated."
For Graph Search, these subtle interpretations initially seemed unimportant, because the feature was originally designed only to help users map the relationships between themselves and their friends. Now Graph Search can extract and analyze posts and comments on Facebook. In other words, everything you write on Facebook can be searched, including anything typed into the status update box at the top of your timeline and news feed.
"Because of differences in cultural upbringing, people use language in very different ways. We need to teach machines to recognize these nuances," said Oleg Rogynskyy, CEO of text analysis company Semantria. "Currently, computers cannot accurately understand these things because they lack cultural context. This will be one of the hardest challenges to overcome in the next 10 to 15 years."
To make computers more like human brains, computer scientists from companies like Baidu, Google, Microsoft, and IBM have started researching "Deep Learning." This fall, Facebook joined their ranks and launched its own "Deep Learning" research group.
Deep learning involves building neural networks: multi-layered software processing systems inspired by the structure of the human brain. Like the brain, these artificially constructed networks can take in information and respond to it. They can work out what an object is or does, and what a sentence means, without the hand-crafted rules of traditional machine learning.
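The "multi-layered" idea can be sketched in a few lines of code. This is a minimal, illustrative forward pass through a stack of layers, not any company's actual system; the layer sizes and random weights are assumptions chosen purely for demonstration:

```python
import numpy as np

def relu(x):
    # Simple nonlinearity applied between layers, loosely analogous
    # to a neuron either firing or staying silent
    return np.maximum(0, x)

def forward(x, layers):
    # Each layer transforms the previous layer's output, so the network
    # builds progressively more abstract representations of the input
    for w, b in layers:
        x = relu(x @ w + b)
    return x

rng = np.random.default_rng(0)
# A toy 3-layer network: 4 inputs -> 8 hidden -> 8 hidden -> 2 outputs
sizes = [(4, 8), (8, 8), (8, 2)]
layers = [(rng.normal(size=s), np.zeros(s[1])) for s in sizes]

print(forward(rng.normal(size=4), layers).shape)  # prints (2,)
```

Training, which adjusts the weights from data, is where the real work lies; the sketch shows only how stacked layers pass information upward.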
Deep learning can tackle many complex problems, such as computer vision, speech recognition, language translation, and natural language processing, but doing so requires massive amounts of data. Richard Socher, a computer scientist studying natural language processing at Stanford University, stated: "Deep learning requires less manual feature engineering, and its capabilities grow with the amount of training data. The more data, the better it can infer correctly."
Companies like Baidu, Google, and Microsoft have already used deep learning algorithms to enhance image and voice search effectiveness. The next big challenge is using deep learning algorithms to decipher personal thoughts and ideas. This challenge will keep these companies busy for a long time — consider how many emotions are represented by the endless stream of information on your Facebook and Twitter pages.
The kind of computer brain Rogynskyy describes can understand how language differs across dialects and contexts. The first step toward that capability is building algorithms that can grasp people's opinions and feelings; the next is building algorithms that can analyze human emotion in more than one dimension, for example recognizing both the good and the bad aspects of something at once. Stanford's Socher recently introduced a deep learning algorithm that can do this, and it understands written language better than previous methods. Several startups have already approached him about it.
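The core idea behind such compositional approaches can be illustrated with a toy example. This is not Socher's actual model, which learns its combination rules from data; here the parse tree is hand-built and the combination rule is hard-coded, purely to show how meaning can be composed bottom-up so that a modifier like "not" flips the polarity of everything beneath it:

```python
# Toy word-level sentiment lexicon; values are illustrative
WORD_SCORES = {"good": 1, "great": 1, "bad": -1}

def score(node):
    # Leaves are words; look up their standalone sentiment
    if isinstance(node, str):
        return WORD_SCORES.get(node, 0)
    # Internal nodes combine their children; negation inverts
    # the sentiment of the entire subtree it modifies
    left, right = node
    if left == "not":
        return -score(right)
    return score(left) + score(right)

# "not (bad movie)": the subtree scores -1, and "not" flips it
print(score(("not", ("bad", "movie"))))  # prints 1
```

The point is that sentiment is computed over structure, not over a loose pile of words, so the same vocabulary can yield opposite scores depending on how it is arranged.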
Today, even the smartest algorithms cannot reliably judge a person's thoughts from a string of words. The most widely used sentiment analysis models have been limited to the so-called "bag-of-words" method, which ignores word order entirely: the system sees only an unordered jumble of vocabulary, looks up each word's meaning, and then guesses whether the sentence or paragraph is positive or negative. Other algorithms analyze short runs of consecutive words, and their results can come closer to the actual meaning, but only slightly.
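A minimal sketch makes the weakness concrete. The tiny lexicon below is invented for illustration, not a real sentiment resource:

```python
# Toy sentiment lexicon: positive words score +1, negative words -1
LEXICON = {"great": 1, "cool": 1, "love": 1,
           "bad": -1, "terrible": -1, "not": -1}

def bag_of_words_score(text):
    # Split into words, look each one up, and sum the scores.
    # Word order is discarded entirely, which is the method's flaw.
    words = text.lower().split()
    return sum(LEXICON.get(w, 0) for w in words)

# A human reads "not bad" as mildly positive; the bag-of-words
# scorer just adds two negative words and gets it badly wrong
print(bag_of_words_score("not bad"))  # prints -2

# Any reordering of the same words produces the same score
print(bag_of_words_score("not cool") == bag_of_words_score("cool not"))  # prints True
```

Since every permutation of a sentence yields an identical score, constructions like negation, sarcasm, and contrast are invisible to this kind of model.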
If all you want is to gauge public opinion in the aggregate, these algorithms can already handle the task. What tech companies truly want, however, is to understand individual users and serve them personalized information and advertisements. Elliot Turner, CEO of deep learning and sentiment analysis company AlchemyAPI, pointed out: "If the system's analysis is wrong 30 percent of the time, you probably wouldn't trust its interpretation of a tweet again."
This is why Facebook and other companies are researching deep learning. They want a technology that can better understand individual users: their feelings, how and with whom they prefer to interact, and everything else. They can use this information to improve the user experience, build brand loyalty, and ultimately sell products to those users, something current technology struggles to do. Turner said: "The power of deep learning lies in constructing high-level abstract representations of data, from letters to words to phrases to sentences to paragraphs."
As the internet spreads, the stores of information online keep growing richer: the Internet Movie Database, Wikipedia, Data.gov, the PubMed literature retrieval service, the CIA World Factbook. All of this can serve as training data for deep learning models. Much of it is publicly available, which gives Facebook still more data to work with and also gives companies without large databases of their own a way into deep learning.