Chandan Prasad Gupta
MS by Research Department of Computer Science and Engineering Kathmandu University Dhulikhel, Nepal
“Sentiment without action is the ruin of the soul.” (Edward Abbey)

It would not be fair to say that too little work has been done in the field of sentiment analysis. Applications built for a specific purpose and targeting a particular user group or product are plentiful, and many of them show an adequate level of expertise and success. The number of application genres, however, remains small. Applications for social media are well matured, while applications for political analysis have peaked only recently. Because of the many possible applications, a good number of companies, large and small, have opinion mining and sentiment analysis as part of their mission. However, I have chosen not to mention these companies individually, because the industrial landscape changes rapidly and lists of companies risk falling out of date quickly. In addition, neither the project nor this writing is intended to endorse any of these companies.
APPLICATIONS FOR SOCIAL MEDIA

Facebook: Sentiment detection has been applied to a variety of media, typically to reviews of products or services, though it is not limited to these. Boiy and Moens, for example, treat sentiment detection as a classification problem and apply different feature selections to multilingual collections of digital content including blog entries, reviews and forum postings. Conclusive measures of bias in such content have been elusive, but progress has been made towards reliable measures of sentiment in text, mapped onto a linear scale such as positive versus negative or emotional versus neutral language. Besides the ambiguity of human language, another bottleneck for sentiment detection methods is the time-consuming creation of sentiment dictionaries. One solution is to crowdsource such dictionaries with minimal effort, as the Sentiment Quiz Facebook application [1] does.

Twitter: Sentiment dictionaries alone are not enough, however, and there are major problems in applying such techniques to microposts such as tweets, which typically contain little contextual information and assume much implicit knowledge. Tweets are also less grammatical than longer posts and make frequent use of emoticons and hashtags, which can form an important part of the meaning. This means that typical NLP solutions such as full or even shallow parsing are unlikely to work well, and new solutions are needed for handling extra-linguistic information. Tweets also make extensive use of irony and sarcasm, which are difficult for a machine to detect. There exists a plethora of tools for performing sentiment analysis of tweets, though most work best on mentions of product brands, where people are clearly expressing opinions about the product.
Generally, the user enters a search term and gets back all the positive and negative (and sometimes neutral) tweets that contain the term, along with graphics such as pie charts or graphs. Typical basic tools are Twitter Sentiment [2], Twendz [3] and Twitrratr [4]. Slightly more sophisticated tools such as SocialMention [5] allow search across a variety of social networks and produce other statistics such as percentages of Strength, Passion and Reach, while others allow the user to correct erroneous analyses. While these tools are simple to use and often provide an attractive display, their analysis is very rudimentary and their performance is low. They also do not identify the opinion holder or the topic of the opinion, assuming (often wrongly) that the opinion is related to the keyword.

[1] http://apps.facebook.com/sentiment-quiz
[2] http://twittersentiment.appspot.com/
[3] http://twendz.waggeneredstrom.com/
[4] http://twitrratr.com/
[5] http://socialmention.com/
[6] http://dev.twitter.com/pages/streaming api
[7] http://json-lib.sourceforge.net/
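The keyword-based tools described above can be approximated by a very small sketch, which also shows why their analysis is rudimentary. The lexicon, the function names and the sample tweets are all hypothetical and chosen only for illustration; none of the tools named above is known to work exactly this way.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; real sentiment dictionaries are far larger.
POSITIVE = {"love", "great", "good", "awesome"}
NEGATIVE = {"hate", "bad", "awful", "terrible"}


def classify(tweet: str) -> str:
    """Label a tweet by counting lexicon hits; a zero balance is neutral."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def search_and_tally(tweets, term):
    """Count the labels of every tweet that contains the search term."""
    hits = [t for t in tweets if term.lower() in t.lower()]
    return Counter(classify(t) for t in hits)


tweets = [
    "I love my new phone",
    "This phone is awful",
    "Bought a phone today",
    "My phone is great but my laptop is terrible",
    "Great weather",
]
tally = search_and_tally(tweets, "phone")
for label in ("positive", "negative", "neutral"):
    print(label, tally[label])
```

The fourth tweet is counted as neutral even though its opinion about the phone is positive, because the sketch never checks which words refer to the search term. This is the weakness noted above: the opinion is simply assumed to relate to the keyword.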
Algorithm, Technology, Tools and Resources behind these applications: These applications consist of a number of processing modules combined to form an application pipeline. They use a number of linguistic pre-processing components such as tokenisation, part-of-speech tagging, morphological analysis and sentence splitting. Full parsing is not used because, given the nature of tweets, it is very unlikely that the quality would be sufficiently high. They apply ANNIE, the default named entity recognition system in GATE, to find named entities. The named entities are then used in the next stages: first for the identification of opinion holders and targets (i.e., people, political parties, etc.), and second, as contextual information for aiding opinion mining. The main body of the opinion mining application involves a set of JAPE grammars which create annotations on segments of text. JAPE is a Java-based pattern matching language used in GATE. The grammar rules create a number of temporary annotations which are later combined with existing annotations and converted into final annotations. In addition to the grammars, they use a set of gazetteer lists containing useful clues and context words: for example, they have developed a gazetteer of affect/emotion words from WordNet. These entries have a feature denoting their part of speech, and information about the original WordNet synset to which they belong. The lists have been modified and extended manually to improve their quality: some words and lists have been deleted while others have been added. As mentioned above, the application aims to find, for each relevant tweet, triples denoting three kinds of entity: Person, Opinion and Political Party. The application creates a number of different annotations on the text which are then combined to form these triples.
The detection of the actual opinion (sentiment) is performed in a number of phases: detecting positive, negative and neutral words (Affect annotations); distinguishing factual or opinionated statements from questions or doubtful statements; identifying negatives; and detecting extra-linguistic clues such as smileys. Because only the actual text of the tweet needs to be processed, and not the metadata, the application uses a special processing resource (the Segment Processing PR) to run over just the text covered by the XML “text” tag in the tweet.
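The phases above can be illustrated with a much-simplified sketch. It is plain Python rather than GATE and JAPE, and the XML layout, the word lists and the function name `analyse` are assumptions made for the example, not part of the application described here.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical miniature gazetteers standing in for the WordNet-derived lists.
AFFECT = {"good": 1, "great": 1, "bad": -1, "awful": -1}
NEGATIONS = {"not", "never", "no"}
SMILEYS = {":)": 1, ":-)": 1, ":(": -1, ":-(": -1}

# Match smileys first, then words, then question marks.
TOKEN = re.compile(r"[:;]-?[()]|\w+|\?")


def analyse(raw_tweet: str) -> dict:
    """Run simplified opinion-detection phases over the tweet text only."""
    # Only the content of the <text> element is processed; metadata is ignored.
    text = ET.fromstring(raw_tweet).findtext("text", default="")
    tokens = TOKEN.findall(text.lower())
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True  # flip the polarity of the next affect word
        elif tok in AFFECT:
            score += -AFFECT[tok] if negate else AFFECT[tok]
            negate = False
        elif tok in SMILEYS:
            score += SMILEYS[tok]
    polarity = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {"is_question": "?" in tokens, "polarity": polarity}


raw = "<tweet><user>alice</user><text>The debate was not good :(</text></tweet>"
print(analyse(raw))
```

Each phase is a single lookup here, whereas the real pipeline builds annotations that later grammar rules combine. The sketch only shows the order of the steps and how restricting processing to the text element keeps metadata out of the analysis.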
APPLICATIONS FOR CUSTOMER RELATIONSHIP MANAGEMENT
Opinion mining literature has been largely driven by applicative interest in domains such as mining online corpora for opinions, or customer relationship management. What do people think about the latest camera-equipped cellular phone? What is the general opinion on the just-passed governmental decree on safety in the workplace? Is popular support for presidential candidate X’s promise of a tax cut growing? Systems capable of automatically detecting and tracking the evolution of customers’ opinions concerning a given product, of voters’ thoughts on a political candidate, or of citizens’ opinions on governmental policies are thus of enormous help to marketers, social scientists, information analysts, policy enforcers, and opinion makers, since they enable them to examine (and draw statistical information from) quantities of textual data beyond the reach of manual approaches. One of the earlier opinion mining works is that of Das and Chen, where the global trend of orientation of opinions toward a certain company, posted in online forums, is compared to its stock price in the same time interval, showing a notable correlation between the two measures. Turney has worked on classifying the orientation of product reviews as either “thumbs up” or “thumbs down”, by using a measure of semantic association between the content of documents and two small sets of terms which are deemed to be representative of the two categories under examination. Similarly, Pang et al. have investigated the use of a standard supervised text classification approach to classify movie reviews by their orientation. On the same task, Pang and Lee have used a subjectivity classifier of sentences to build document summaries which contain only the opinionated content. They have used it as a preprocessing component to the movie review classifier, improving its effectiveness.
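The measure of semantic association attributed to Turney above is usually formulated with pointwise mutual information (PMI): a term's orientation is its association with positive seed terms minus its association with negative seed terms. The sketch below is a minimal illustration under stated assumptions. It replaces the search-engine hit counts of the published method with co-occurrence counts in a hypothetical toy corpus, and the seed sets and smoothing constant are chosen only for the example.

```python
import math

# Hypothetical toy corpus: each document is reduced to its set of terms.
DOCS = [
    {"superb", "excellent", "camera"},
    {"superb", "good", "lens"},
    {"flimsy", "poor", "case"},
    {"flimsy", "bad", "strap"},
    {"excellent", "good"},
    {"poor", "bad"},
]
POSITIVE_SEEDS = {"excellent", "good"}  # assumed seed terms
NEGATIVE_SEEDS = {"poor", "bad"}        # assumed seed terms
EPS = 0.01  # smoothing so that a zero co-occurrence count does not give log(0)


def pmi(a: str, b: str) -> float:
    """PMI of two terms, estimated from document-level co-occurrence."""
    n = len(DOCS)
    count_a = sum(a in d for d in DOCS) + EPS
    count_b = sum(b in d for d in DOCS) + EPS
    count_ab = sum(a in d and b in d for d in DOCS) + EPS
    return math.log2(count_ab * n / (count_a * count_b))


def semantic_orientation(term: str) -> float:
    """Association with the positive seeds minus association with the negative seeds."""
    positive = sum(pmi(term, s) for s in POSITIVE_SEEDS)
    negative = sum(pmi(term, s) for s in NEGATIVE_SEEDS)
    return positive - negative


for word in ("superb", "flimsy"):
    print(word, round(semantic_orientation(word), 2))
```

A review can then be labelled “thumbs up” when the average orientation of its terms is positive and “thumbs down” otherwise.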
Pang and Lee have also proposed an extension of the movie review classification task into an ordinal regression task, i.e. the assignment of movie reviews to a rating scale, ranging from no star (totally negative review) to five stars (totally positive review). On the movie review classification task, Whitelaw et al. have shown a significant accuracy improvement produced by using a lexical resource where subjective terms are classified by their attitude type (e.g. distinction between aesthetic, affective, and moral evaluations). Attardi and Simi have used information about which terms express subjectivity in a text search engine. Their system is able, for example, to handle the query [“Barack Obama” near Subjective], which retrieves all the documents where the expression “Barack Obama” appears near (within a maximum distance of a given number of words) any subjective term. They used this system in the 2006 TREC Blog track, showing a notable improvement in the precision of retrieval of opinionated content when such queries are used. Yu and Hatzivassiloglou describe a system which classifies entire documents, and then each of their sentences, as subjective or not. Its intended use is as the first data processing block in an opinion question answering system. Such a system is designed to answer questions like “What are the causes of global warming?” where the answer has to take into account the multiple perspectives, and opinions, on the topic. This is a different, and harder, task than traditional question answering, in which answers are typically factual and univocal.
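The proximity query described above for Attardi and Simi's system can be illustrated with a short sketch. It is not their implementation: the subjective-term list, the function name and the default distance are hypothetical, and a real search engine would answer such a query from an index rather than by scanning the text.

```python
import re

# Hypothetical list of subjective terms; a real system would use a lexical resource.
SUBJECTIVE = {"brilliant", "awful", "disappointing", "love", "hate"}


def near_subjective(text: str, phrase: str, max_distance: int = 5) -> bool:
    """True if `phrase` occurs within `max_distance` words of a subjective term."""
    tokens = re.findall(r"\w+", text.lower())
    target = phrase.lower().split()
    k = len(target)
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] == target:
            # Words up to max_distance positions before and after the phrase.
            before = tokens[max(0, i - max_distance):i]
            after = tokens[i + k:i + k + max_distance]
            if any(t in SUBJECTIVE for t in before + after):
                return True
    return False


documents = [
    "I thought Barack Obama gave a brilliant speech",
    "Barack Obama visited Ohio on Tuesday",
]
matches = [d for d in documents if near_subjective(d, "Barack Obama")]
print(matches)
```

Only the first document is retrieved, since the second mentions the name without any subjective term nearby.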
Wiebe et al. [95, 97] have investigated the opinion extraction problem, i.e. the task of detecting, within a sentence or document, the exact expressions denoting the statement of an opinion, and detecting therein the sub-expressions denoting the key components and properties of this statement (e.g. the opinion holder, the object of the opinion, the type of opinion, the strength of the opinion, etc.). Following these initial works, many other researchers have worked on the task [11, 18, 17, 55, 56].
APPLICATION TO REVIEW RELATED WEBSITES
Clearly, the same capabilities that a review-oriented search engine would have could also serve very well as the basis for the creation and automated upkeep of review- and opinion-aggregation websites. That is, as an alternative to sites like Epinions (www.epinions.com) that solicit feedback and reviews, one could imagine sites that proactively gather such information. Topics need not be restricted to product reviews, but could include opinions about candidates running for office, political issues, and so forth. The technologies discussed here also apply to more traditional review-solicitation sites.
APPLICATIONS AS A TECHNOLOGY COMPONENT
Detection of “flames” (overly heated or antagonistic language) in email or other types of communication is another possible use of subjectivity detection and classification. In online systems that display ads as sidebars, it is helpful to detect webpages that contain sensitive content inappropriate for ad placement; for more sophisticated systems, it could be useful to bring up product ads when relevant positive sentiments are detected, and perhaps more importantly, to pull the ads when relevant negative statements are discovered. It has also been argued that information extraction can be improved by discarding information found in subjective sentences. Question answering is another area where sentiment analysis can prove useful [189, 275, 285]. For example, opinion-oriented questions may require different treatment. Alternatively, Lita et al. suggest that for definitional questions, providing an answer that includes more information about how an entity is viewed may better inform the user. Summarization may also benefit from accounting for multiple viewpoints. Additionally, there are potential relations to citation analysis, where, for example, one might wish to determine whether an author is citing a piece of work as supporting evidence or as research that he or she dismisses. Similarly, one effort seeks to use semantic orientation to track literary reputation. In general, the computational treatment of affect has been motivated in part by the desire to improve human-computer interaction [188, 192, 296].
APPLICATIONS FOR GOVERNMENT INTELLIGENCE
The field of opinion mining and sentiment analysis is well suited to various types of intelligence applications. Indeed, business intelligence seems to be one of the main factors behind corporate interest in the field. Consider, for instance, the following scenario (the text of which also appears in Lee). A major computer manufacturer, disappointed with unexpectedly low sales, finds itself confronted with the question: “Why aren’t consumers buying our laptop?” While concrete data such as the laptop’s weight or the price of a competitor’s model are obviously relevant, answering this question requires focusing more on people’s personal views of such objective characteristics. Moreover, subjective judgments regarding intangible qualities (e.g., “the design is tacky” or “customer service was condescending”) or even misperceptions (e.g., “updated device drivers aren’t available” when such device drivers do in fact exist) must be taken into account as well. Sentiment-analysis technologies for extracting opinions from unstructured human-authored documents would be excellent tools for handling many business-intelligence tasks related to the one just described. Continuing with our example scenario: it would be difficult to directly survey laptop purchasers who haven’t bought the company’s product. Rather, we could employ a system that (a) finds reviews or other expressions of opinion on the Web (newsgroups, individual blogs, and aggregation sites such as Epinions are likely to be productive sources) and then (b) creates condensed versions of individual reviews or a digest of overall consensus points. This would save an analyst from having to read potentially dozens or even hundreds of versions of the same complaints. Note that Internet sources can vary wildly in form, tenor, and even grammaticality; this fact underscores the need for robust techniques even when only one language (e.g., English) is considered.
Besides reputation management and public relations, one might hope that by tracking public viewpoints, one could perform trend prediction in sales or other relevant data. (See our discussion of Broader Implications (Section 6) for more discussion of potential economic impact.) Government intelligence is another application that has been considered. For example, it has been suggested that one could monitor sources for increases in hostile or negative communications.
MULTI DOMAIN APPLICATIONS
One exciting turn of events has been the confluence of interest in opinions and sentiment within computer science with interest in opinions and sentiment in other fields. As is well known, opinions matter a great deal in politics. Some work has focused on understanding what voters are thinking [83, 110, 126, 178, 218], whereas other projects have as a long-term goal the clarification of politicians’ positions, such as what public figures support or oppose, to enhance the quality of information that voters have access to [27, 111, 295]. Sentiment analysis has specifically been proposed as a key enabling technology in eRulemaking, allowing the automatic analysis of the opinions that people submit about pending policy or government-regulation proposals [50, 175, 272]. On a related note, there has been investigation into opinion mining in weblogs devoted to legal matters, sometimes known as “blawgs”. Interactions with sociology promise to be extremely fruitful. For instance, the issue of how ideas and innovations diffuse involves the question of who is positively or negatively disposed towards whom, and hence who would be more or less receptive to new information transmission from a given source. To take just one other example: structural balance theory is centrally concerned with the polarity of “ties” between people and how this relates to group cohesion. These ideas have begun to be applied to online media analysis [58, 144, inter alia].
ENTITY DETECTION AND ASSIGNMENT
One problem that has not been studied so far is the assignment of the entities that are talked about in each sentence. Let us use forum discussions about products as an example to make the problem concrete. In a typical discussion post, the author may give opinions on multiple products and also compare them. The issue is how to detect which products are talked about in each sentence. If the sentence contains the product names, they need to be identified. We call this problem entity discovery. If the product names are not explicitly mentioned in the sentence but are implied through the use of pronouns and language conventions, we need to infer the products. We call this problem entity assignment. These problems are important because, without knowing which products each sentence talks about, the opinion mined from the sentence is of little use. Entity discovery is based on pattern discovery, and entity assignment is based on mining of comparative sentences. Experimental results using a large number of forum posts demonstrate the effectiveness of the technique. This system has also been successfully tested in a commercial setting.
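To make the two problems concrete, the sketch below pairs each sentence of a post with the products it mentions or implies. It deliberately substitutes much simpler techniques than those named above: entity discovery is a lookup against a hypothetical list of known product names rather than pattern discovery, and entity assignment simply carries forward the most recently mentioned products rather than mining comparative sentences.

```python
import re

KNOWN_PRODUCTS = ["Camera A", "Camera B"]  # hypothetical product names


def assign_entities(sentences):
    """Pair each sentence with the products it mentions or implies."""
    result = []
    last_seen = []
    for sentence in sentences:
        # Entity discovery (simplified): look for explicit product names.
        found = [
            p for p in KNOWN_PRODUCTS
            if re.search(re.escape(p), sentence, re.IGNORECASE)
        ]
        if found:
            last_seen = found
        # Entity assignment (simplified): with no explicit name, assume the
        # sentence still refers to the most recently mentioned products.
        result.append((sentence, list(last_seen)))
    return result


post = [
    "I bought Camera A last week.",
    "It takes sharp pictures.",
    "Camera B is cheaper than Camera A.",
    "The zoom is weak.",
]
assigned = assign_entities(post)
for sentence, products in assigned:
    print(products, sentence)
```

The last sentence is assigned to both products, which shows where the simple carry-forward rule breaks down. Deciding which side of a comparison a follow-up sentence refers to is the kind of ambiguity that mining comparative sentences is meant to resolve.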