Opinion mining and sentiment analysis


Foundations and Trends® in Information Retrieval

Vol. 2, Nos. 1–2 (2008) 1–135

© 2008 B. Pang and L. Lee

DOI: 10.1561/1500000001

Opinion Mining and Sentiment Analysis

Bo Pang¹ and Lillian Lee²

¹ Yahoo! Research, 701 First Avenue, Sunnyvale, CA 94089, USA, bopang@316fc6ef524de518964b7dc6

² Computer Science Department, Cornell University, Ithaca, NY 14853, USA, llee@316fc6ef524de518964b7dc6

Abstract

An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object.

This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

1 Introduction

Romance should never begin with sentiment. It should begin with science and end with a settlement.

-- Oscar Wilde, An Ideal Husband

1.1 The Demand for Information on Opinions and Sentiment

“What other people think” has always been an important piece of information for most of us during the decision-making process. Long before awareness of the World Wide Web became widespread, many of us asked our friends to recommend an auto mechanic or to explain who they were planning to vote for in local elections, requested reference letters regarding job applicants from colleagues, or consulted Consumer Reports to decide what dishwasher to buy. But the Internet and the Web have now (among other things) made it possible to find out about the opinions and experiences of those in the vast pool of people that are neither our personal acquaintances nor well-known professional critics; that is, people we have never heard of. And conversely, more and more people are making their opinions available to strangers via the Internet.


Indeed, according to two surveys of more than 2000 American adults each [63, 127],

• 81% of Internet users (or 60% of Americans) have done online research on a product at least once;
• 20% (15% of all Americans) do so on a typical day;
• among readers of online reviews of restaurants, hotels, and various services (e.g., travel agencies or doctors), between 73% and 87% report that reviews had a significant influence on their purchase;¹
• consumers report being willing to pay from 20% to 99% more for a 5-star-rated item than a 4-star-rated item (the variance stems from what type of item or service is considered);
• 32% have provided a rating on a product, service, or person via an online ratings system, and 30% (including 18% of online senior citizens) have posted an online comment or review regarding a product or service.²

¹ Section 6.1 discusses quantitative analyses of actual economic impact, as opposed to consumer perception.

² Interestingly, Hitlin and Rainie [123] report that “Individuals who have rated something online are also more skeptical of the information that is available on the Web.”

We hasten to point out that consumption of goods and services is not the only motivation behind people’s seeking out or expressing opinions online. A need for political information is another important factor. For example, in a survey of over 2500 American adults, Rainie and Horrigan [248] studied the 31% of Americans (over 60 million people) that were 2006 campaign internet users, defined as those who gathered information about the 2006 elections online and exchanged views via email. Of these,

• 28% said that a major reason for these online activities was to get perspectives from within their community, and 34% said that a major reason was to get perspectives from outside their community;
• 27% had looked online for the endorsements or ratings of external organizations;
• 28% said that most of the sites they use share their point of view, but 29% said that most of the sites they use challenge their point of view, indicating that many people are not simply looking for validations of their pre-existing opinions; and
• 8% posted their own political commentary online.

The user hunger for and reliance upon online advice and recommendations that the data above reveals is merely one reason behind the surge of interest in new systems that deal directly with opinions as a first-class object. But Horrigan [127] reports that while a majority of American internet users report positive experiences during online product research, at the same time, 58% also report that online information was missing, impossible to find, confusing, and/or overwhelming. Thus, there is a clear need to aid consumers of products and of information by building better information-access systems than are currently in existence.

The interest that individual users show in online opinions about products and services, and the potential influence such opinions wield, is something that vendors of these items are paying more and more attention to [124]. The following excerpt from a whitepaper is illustrative of the envisioned possibilities, or at the least the rhetoric surrounding the possibilities:

With the explosion of Web 2.0 platforms such as blogs, discussion forums, peer-to-peer networks, and various other types of social media ... consumers have at their disposal a soapbox of unprecedented reach and power by which to share their brand experiences and opinions, positive or negative, regarding any product or service. As major companies are increasingly coming to realize, these consumer voices can wield enormous influence in shaping the opinions of other consumers and, ultimately, their brand loyalties, their purchase decisions, and their own brand [...]. Companies can respond to the consumer insights they generate through social media monitoring and analysis by modifying their marketing messages, brand positioning, product development, and other activities accordingly.

-- Zabin and Jefferies [327]

But industry analysts note that the leveraging of new media for the purpose of tracking product image requires new technologies; here is a representative snippet describing their concerns:

Marketers have always needed to monitor media for information related to their brands, whether it’s for public relations activities, fraud violations,³ or competitive intelligence. But fragmenting media and changing consumer behavior have crippled traditional monitoring methods. Technorati estimates that 75,000 new blogs are created daily, along with 1.2 million new posts each day, many discussing consumer opinions on products and services. Tactics [of the traditional sort] such as clipping services, field agents, and ad hoc research simply can’t keep pace.

-- Kim [154]

Thus, aside from individuals, an additional audience for systems capable of automatically analyzing consumer sentiment, as expressed in no small part in online venues, consists of companies anxious to understand how their products and services are perceived.

1.2 What Might be Involved? An Example Examination of the Construction of an Opinion/Review Search Engine

Creating systems that can process subjective information effectively requires overcoming a number of novel challenges. To illustrate some of these challenges, let us consider the concrete example of what building an opinion- or review-search application could involve. As we have discussed, such an application would fill an important and prevalent information need, whether one restricts attention to blog search [213] or considers the more general types of search that have been described above.

³ Presumably, the author means “the detection or prevention of fraud violations,” as opposed to the commission thereof.

The development of a complete review- or opinion-search application might involve attacking each of the following problems.

(1) If the application is integrated into a general-purpose search engine, then one would need to determine whether the user is in fact looking for subjective material. This may or may not be a difficult problem in and of itself: perhaps queries of this type will tend to contain indicator terms like “review,” “reviews,” or “opinions,” or perhaps the application would provide a “checkbox” to the user so that he or she could indicate directly that reviews are what is desired; but in general, query classification is a difficult problem (indeed, it was the subject of the 2005 KDD Cup challenge [185]).

(2) Besides the still-open problem of determining which documents are topically relevant to an opinion-oriented query, an additional challenge we face in our new setting is simultaneously or subsequently determining which documents or portions of documents contain review-like or opinionated material. Sometimes this is relatively easy, as in texts fetched from review-aggregation sites in which review-oriented information is presented in relatively stereotyped format: examples include 316fc6ef524de518964b7dc6 and 316fc6ef524de518964b7dc6. However, blogs also notoriously contain quite a bit of subjective content and thus are another obvious place to look (and are more relevant than shopping sites for queries that concern politics, people, or other non-products), but the desired material within blogs can vary quite widely in content, style, presentation, and even level of grammaticality.

(3) Once one has target documents in hand, one is still faced with the problem of identifying the overall sentiment expressed by these documents and/or the specific opinions regarding particular features or aspects of the items or topics in question, as necessary. Again, while some sites make this kind of extraction easier (for instance, user reviews posted to Yahoo! Movies must specify grades for pre-defined sets of characteristics of films), more free-form text can be much harder for computers to analyze, and indeed can pose additional challenges; for example, if quotations are included in a newspaper article, care must be taken to attribute the views expressed in each quotation to the correct entity.

(4) Finally, the system needs to present the sentiment information it has garnered in some reasonable summary fashion. This can involve some or all of the following actions:
    (a) Aggregation of “votes” that may be registered on different scales (e.g., one reviewer uses a star system, but another uses letter grades).
    (b) Selective highlighting of some opinions.
    (c) Representation of points of disagreement and points of consensus.
    (d) Identification of communities of opinion holders.
    (e) Accounting for different levels of authority among opinion holders.
    Note that it might be more appropriate to produce a visualization of sentiment data rather than a textual summary of it, whereas textual summaries are what is usually created in standard topic-based multi-document summarization.
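As a rough illustration of action (a) in item (4), the sketch below maps votes given on a star scale and on a letter-grade scale onto a common [0, 1] range before averaging them. The function names, the linear rescaling of stars, and the letter-to-score table are all assumptions made for this example; the survey does not prescribe any particular normalization.

```python
# Illustrative mapping from letter grades to [0, 1]; these values are an
# assumption for this sketch, not a standard.
LETTER_TO_SCORE = {"A": 1.0, "B": 0.75, "C": 0.5, "D": 0.25, "F": 0.0}


def normalize_vote(scale, value, max_stars=5):
    """Map one vote onto a common [0, 1] scale."""
    if scale == "stars":
        # Linear rescaling: 1 star -> 0.0, max_stars -> 1.0.
        return (value - 1) / (max_stars - 1)
    if scale == "letter":
        return LETTER_TO_SCORE[value.upper()]
    raise ValueError(f"unknown rating scale: {scale!r}")


def aggregate_votes(votes):
    """Return the mean normalized score of a non-empty list of (scale, value) votes."""
    scores = [normalize_vote(scale, value) for scale, value in votes]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # One reviewer uses stars, another uses letter grades.
    votes = [("stars", 5), ("letter", "C"), ("stars", 3)]
    print(round(aggregate_votes(votes), 3))
```

A plain mean treats every reviewer equally; accounting for different levels of authority, as in action (e), would call for a weighted combination instead.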

1.3 Our Charge and Approach

Challenges (2), (3), and (4) in the above list are very active areas of research, and the bulk of this survey is devoted to reviewing work in these three sub-fields. However, due to space limitations and the focus of the journal series in which this survey appears, we do not and cannot aim to be completely comprehensive.

In particular, when we began to write this survey, we were directly charged to focus on information-access applications, as opposed to work of more purely linguistic interest. We stress that the importance of work in the latter vein is absolutely not in question.

Given our mandate, the reader will not be surprised that we describe the applications that sentiment-analysis systems can facilitate and review many kinds of approaches to a variety of opinion-oriented classification problems. We have also chosen to attempt to draw attention to single- and multi-document summarization of evaluative text, especially since interesting considerations regarding graphical visualization arise. Finally, we move beyond just the technical issues, devoting significant attention to the broader implications that the development of opinion-oriented information-access services has: we look at questions of privacy, manipulation, and whether or not reviews can have measurable economic impact.

1.4 Early History

Although the area of sentiment analysis and opinion mining has recently enjoyed a huge burst of research activity, there has been a steady undercurrent of interest for quite a while. One could count early projects on beliefs as forerunners of the area [48, 317]. Later work focused mostly on interpretation of metaphor, narrative, point of view, affect, evidentiality in text, and related areas [121, 133, 149, 262, 306, 310, 311, 312, 313].

The year 2001 or so seems to mark the beginning of widespread awareness of the research problems and opportunities that sentiment analysis and opinion mining raise [51, 66, 69, 79, 192, 215, 221, 235, 291, 296, 298, 305, 326], and subsequently there have been literally hundreds of papers published on the subject.

Factors behind this “land rush” include:

• the rise of machine learning methods in natural language processing and information retrieval;
• the availability of datasets for machine learning algorithms to be trained on, due to the blossoming of the World Wide Web and, specifically, the development of review-aggregation web-sites; and, of course
• realization of the fascinating intellectual challenges and commercial and intelligence applications that the area offers.


1.5 A Note on Terminology: Opinion Mining, Sentiment Analysis, Subjectivity, and All that

‘The beginning of wisdom is the definition of terms,’ wrote Socrates. The aphorism is highly applicable when it comes to the world of social media monitoring and analysis, where any semblance of universal agreement on terminology is altogether lacking.

Today, vendors, practitioners, and the media alike call this still-nascent arena everything from ‘brand monitoring,’ ‘buzz monitoring’ and ‘online anthropology,’ to ‘market influence analytics,’ ‘conversation mining’ and ‘online consumer intelligence’ .... In the end, the term ‘social media monitoring and analysis’ is itself a verbal crutch. It is placeholder [sic], to be used until something better (and shorter) takes hold in the English language to describe the topic of this report.

-- Zabin and Jefferies [327]

The above quotation highlights the problems that have arisen in trying to name a new area. The quotation is particularly apt in the context of this survey because the field of “social media monitoring and analysis” (or however one chooses to refer to it) is precisely one that the body of work we review is very relevant to. And indeed, there has been to date no uniform terminology established for the relatively young field we discuss in this survey. In this section, we simply mention some of the terms that are currently in vogue, and attempt to indicate what these terms tend to mean in research papers that the interested reader may encounter.

The body of work we review is that which deals with the computational treatment of (in alphabetical order) opinion, sentiment, and subjectivity in text. Such work has come to be known as opinion mining, sentiment analysis, and/or subjectivity analysis. The phrases review mining and appraisal extraction have been used, too, and there are some connections to affective computing, where the goals include enabling computers to recognize and express emotions [239]. This proliferation of terms reflects differences in the connotations that these terms carry, both in their original general-discourse usages⁴ and in the usages that have evolved in the technical literature of several communities.

In 1994, Wiebe [311], influenced by the writings of the literary theorist Banfield [26], centered the idea of subjectivity around that of private states, defined by Quirk et al. [245] as states that are not open to objective observation or verification. Opinions, evaluations, emotions, and speculations all fall into this category; but a canonical example of research typically described as a type of subjectivity analysis is the recognition of opinion-oriented language in order to distinguish it from objective language. While there has been some research self-identified as subjectivity analysis on the particular application area of determining the value judgments (e.g., “four stars” or “C+”) expressed in the evaluative opinions that are found, this application has not tended to be a major focus of such work.

The term opinion mining appears in a paper by Dave et al. [69] that was published in the proceedings of the 2003 WWW conference; the publication venue may explain the popularity of the term within communities strongly associated with Web search or information retrieval. According to Dave et al. [69], the ideal opinion-mining tool would “process a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good).” Much of the subsequent research self-identified as opinion mining fits this description in its emphasis on extracting and analyzing judgments on various aspects of given items. However, the term has recently also been interpreted more broadly to include many different types of analysis of evaluative text [190].

⁴ To see that the distinctions in common usage can be subtle, consider how interrelated the following set of definitions given in Merriam-Webster’s Online Dictionary are:

Synonyms: opinion, view, belief, conviction, persuasion, sentiment mean a judgment one holds as true.

• Opinion implies a conclusion thought out yet open to dispute (each expert seemed to have a different opinion).
• View suggests a subjective opinion (very assertive in stating his views).
• Belief implies often deliberate acceptance and intellectual assent (a firm belief in her party’s platform).
• Conviction applies to a firmly and seriously held belief (the conviction that animal life is as sacred as human).
• Persuasion suggests a belief grounded on assurance (as by evidence) of its truth (was of the persuasion that everything changes).
• Sentiment suggests a settled opinion reflective of one’s feelings (her feminist sentiments are well-known).

The history of the phrase sentiment analysis parallels that of “opinion mining” in certain respects. The term “sentiment” used in reference to the automatic analysis of evaluative text and tracking of the predictive judgments therein appears in 2001 papers by Das and Chen [66] and Tong [296], due to these authors’ interest in analyzing market sentiment. It subsequently occurred within 2002 papers by Turney [298] and Pang et al. [235], which were published in the proceedings of the annual meeting of the Association for Computational Linguistics (ACL) and the annual conference on Empirical Methods in Natural Language Processing (EMNLP). Moreover, Nasukawa and Yi [221] entitled their 2003 paper, “Sentiment analysis: Capturing favorability using natural language processing”, and a paper in the same year by Yi et al. [323] was named “Sentiment Analyzer: Extracting sentiments about a given topic using natural language processing techniques.” These events together may explain the popularity of “sentiment analysis” among communities self-identified as focused on NLP. A sizeable number of papers mentioning “sentiment analysis” focus on the specific application of classifying reviews as to their polarity (either positive or negative), a fact that appears to have caused some authors to suggest that the phrase refers specifically to this narrowly defined task. However, nowadays many construe the term more broadly to mean the computational treatment of opinion, sentiment, and subjectivity in text.

Thus, when broad interpretations are applied, “sentiment analysis” and “opinion mining” denote the same field of study (which itself can be considered a sub-area of subjectivity analysis). We have attempted to use these terms more or less interchangeably in this survey. This is in no small part because we view the field as representing a unified body of work, and would thus like to encourage researchers in the area to share terminology regardless of the publication venues at which their papers might appear.

2 Applications

Sentiment without action is the ruin of the soul.

-- Edward Abbey

We used one application of opinion mining and sentiment analysis as a motivating example in the Introduction, namely, web search targeted toward reviews. But other applications abound. In this section, we seek to enumerate some of the possibilities.

It is important to mention that because of all the possible applications, there are a good number of companies, large and small, that have opinion mining and sentiment analysis as part of their mission. However, we have elected not to mention these companies individually due to the fact that the industrial landscape tends to change quite rapidly, so that lists of companies risk falling out of date rather quickly.

2.1 Applications to Review-Related Websites

Clearly, the same capabilities that a review-oriented search engine would have could also serve very well as the basis for the creation and automated upkeep of review- and opinion-aggregation websites. That is, as an alternative to sites like Epinions that solicit feedback and reviews, one could imagine sites that proactively gather such information. Topics need not be restricted to product reviews, but could include opinions about candidates running for office, political issues, and so forth.

There are also applications of the technologies we discuss to more traditional review-solicitation sites. Summarizing user reviews is an important problem. One could also imagine that errors in user ratings could be fixed: there are cases where users have clearly accidentally selected a low rating when their review indicates a positive evaluation [47]. Moreover, as discussed later in this survey (see Section 5.2.4, for example), there is some evidence that user ratings can be biased or otherwise in need of correction, and automated classifiers could provide such updates.

2.2 Applications as a Sub-Component Technology

Sentiment-analysis and opinion-mining systems also have an important potential role as enabling technologies for other systems.

One possibility is as an augmentation to recommendation systems [292, 293], since it might behoove such a system not to recommend items that receive a lot of negative feedback.

Detection of “flames” (overly heated or antagonistic language) in email or other types of communication [276] is another possible use of subjectivity detection and classification.

In online systems that display ads as sidebars, it is helpful to detect webpages that contain sensitive content inappropriate for ads placement [137]; for more sophisticated systems, it could be useful to bring up product ads when relevant positive sentiments are detected, and perhaps more importantly, nix the ads when relevant negative statements are discovered.

It has also been argued that information extraction can be improved by discarding information found in subjective sentences [256].

Question answering is another area where sentiment analysis can prove useful [274, 284, 189]. For example, opinion-oriented questions may require different treatment. Alternatively, Lita et al. [189] suggest that for definitional questions, providing an answer that includes more information about how an entity is viewed may better inform the user.

Summarization may also benefit from accounting for multiple viewpoints [265].

Additionally, there are potentially relations to citation analysis, where, for example, one might wish to determine whether an author is citing a piece of work as supporting evidence or as research that he or she dismisses [238]. Similarly, one effort seeks to use semantic orientation to track literary reputation [287].

In general, the computational treatment of affect has been motivated in part by the desire to improve human–computer interaction [188, 192, 295].

2.3 Applications in Business and Government Intelligence

The field of opinion mining and sentiment analysis is well-suited to various types of intelligence applications. Indeed, business intelligence seems to be one of the main factors behind corporate interest in the field.

Consider, for instance, the following scenario (the text of which also appears in Lee [181]). A major computer manufacturer, disappointed with unexpectedly low sales, finds itself confronted with the question: “Why aren’t consumers buying our laptop?” While concrete data such as the laptop’s weight or the price of a competitor’s model are obviously relevant, answering this question requires focusing more on people’s personal views of such objective characteristics. Moreover, subjective judgments regarding intangible qualities (e.g., “the design is tacky” or “customer service was condescending”) or even misperceptions (e.g., “updated device drivers are not available” when such device drivers do in fact exist) must be taken into account as well.

Sentiment-analysis technologies for extracting opinions from unstructured human-authored documents would be excellent tools for handling many business-intelligence tasks related to the one just described. Continuing with our example scenario: it would be difficult to try to directly survey laptop purchasers who have not bought the company’s product. Rather, we could employ a system that (a) finds reviews or other expressions of opinion on the Web (newsgroups, individual blogs, and aggregation sites such as Epinions are likely to be productive sources) and then (b) creates condensed versions of individual reviews or a digest of overall consensus points. This would save an analyst from having to read potentially dozens or even hundreds of versions of the same complaints. Note that Internet sources can vary wildly in form, tenor, and even grammaticality; this fact underscores the need for robust techniques even when only one language (e.g., English) is considered.

Besides reputation management and public relations, one might perhaps hope that by tracking public viewpoints, one could perform trend prediction in sales or other relevant data [214]. (See our discussion of Broader Implications (Section 6) for more discussion of potential economic impact.)

Government intelligence is another application that has been considered. For example, it has been suggested that one could monitor sources for increases in hostile or negative communications [1].

2.4 Applications Across Different Domains

One exciting turn of events has been the confluence of interest in opinions and sentiment within computer science with interest in opinions and sentiment in other fields.

As is well known, opinions matter a great deal in politics. Some work has focused on understanding what voters are thinking [83, 110, 126, 178, 219], whereas other projects have as a long term goal the clarification of politicians’ positions, such as what public figures support or oppose, to enhance the quality of information that voters have access to [27, 111, 294].

Sentiment analysis has specifically been proposed as a key enabling technology in eRulemaking, allowing the automatic analysis of the opinions that people submit about pending policy or government-regulation proposals [50, 175, 271].

On a related note, there has been investigation into opinion mining in weblogs devoted to legal matters, sometimes known as “blawgs” [64].

Interactions with sociology promise to be extremely fruitful. For instance, the issue of how ideas and innovations diffuse [258] involves the question of who is positively or negatively disposed toward whom, and hence who would be more or less receptive to new information transmission from a given source. To take just one other example: structural balance theory is centrally concerned with the polarity of “ties” between people [54] and how this relates to group cohesion. These ideas have begun to be applied to online media analysis [58, 144].

3 General Challenges

3.1 Contrasts with Standard Fact-Based Textual Analysis

The increasing interest in opinion mining and sentiment analysis is partly due to its potential applications, which we have just discussed. Equally important are the new intellectual challenges that the field presents to the research community. So what makes the treatment of evaluative text different from “classic” text mining and fact-based analysis?

Take text categorization, for example. Traditionally, text categorization seeks to classify documents by topic. There can be many possible categories, the definitions of which might be user- and application-dependent; and for a given task, we might be dealing with as few as two classes (binary classification) or as many as thousands of classes (e.g., classifying documents with respect to a complex taxonomy). In contrast, with sentiment classification (see Section 4.1 for more details on precise definitions), we often have relatively few classes (e.g., “positive” or “3 stars”) that generalize across many domains and users. In addition, while the different classes in topic-based categorization can be completely unrelated, the sentiment labels that are widely considered in previous work typically represent opposing (if the task is binary classification) or ordinal/numerical categories (if classification is according to a multi-point scale). In fact, the regression-like nature of strength of feeling, degree of positivity, and so on seems rather unique to sentiment categorization (although one could argue that the same phenomenon exists with respect to topic-based relevance).

There are also many characteristics of answers to opinion-oriented questions that differ from those for fact-based questions [284]. As a result, opinion-oriented information extraction, as a way to approach opinion-oriented question answering, naturally differs from traditional information extraction (IE) [49]. Interestingly, in a manner that is similar to the situation for the classes in sentiment-based classification, the templates for opinion-oriented IE also often generalize well across different domains, since we are interested in roughly the same set of fields for each opinion expression (e.g., holder, type, strength) regardless of the topic. In contrast, traditional IE templates can differ greatly from one domain to another: the typical template for recording information relevant to a natural disaster is very different from a typical template for storing bibliographic information.
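To make the notion of a domain-independent opinion template concrete, here is a minimal sketch of a record type holding the fields mentioned above (holder, type, strength). The class name, the extra `text` field, and the label values such as "negative" or "high" are assumptions chosen for illustration; the two example expressions are taken from elsewhere in this survey.

```python
from dataclasses import dataclass, fields


@dataclass
class OpinionRecord:
    """One opinion expression; the same fields apply whatever the topic."""
    holder: str        # who expresses the opinion
    opinion_type: str  # e.g. "negative" (label vocabulary is an assumption)
    strength: str      # e.g. "high" (an illustrative judgment)
    text: str          # the opinion expression itself


# Two records from unrelated domains, both filling the same template.
laptop = OpinionRecord("consumer", "negative", "medium", "the design is tacky")
books = OpinionRecord("Mark Twain", "negative", "high",
                      "Jane Austen's books madden me")

template_fields = [f.name for f in fields(OpinionRecord)]
print(template_fields)
```

A template for natural disasters or bibliographic entries would, by contrast, need an entirely different set of fields for each domain.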

These distinctions might make our problems appear deceptively simpler than their counterparts in fact-based analysis, but this is far from the truth. In the next section, we sample a few examples to show what makes these problems difficult compared to traditional fact-based text analysis.

3.2 Factors that Make Opinion Mining Difficult

Let us begin with a sentiment polarity text-classification example. Suppose we wish to classify an opinionated text as either positive or negative, according to the overall sentiment expressed by the author within it. Is this a difficult task?

To answer this question, first consider the following example, consisting of only one sentence (by Mark Twain): “Jane Austen’s books madden me so that I can’t conceal my frenzy from the reader.” Just as the topic of this text segment can be identified by the phrase “Jane Austen,” the presence of words like “madden” and “frenzy” suggests negative sentiment. So one might think this is an easy task, and hypothesize that the polarity of opinions can generally be identified by a set of keywords.

But the results of an early study by Pang et al. [235] on movie reviews suggest that coming up with the right set of keywords might be less trivial than one might initially think. The purpose of Pang et al.’s pilot study was to better understand the difficulty of the document-level sentiment-polarity classification problem. Two human subjects were asked to pick keywords that they would consider to be good indicators of positive and negative sentiment. As shown in Figure 3.1, the use of the subjects’ lists of keywords achieves about 60% accuracy when employed within a straightforward classification policy. In contrast, word lists of the same size but chosen based on examination of the corpus’ statistics achieve almost 70% accuracy, even though some of the terms, such as “still,” might not look that intuitive at first.
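A keyword-based policy of this kind can be sketched as follows. The sketch assumes that the policy simply counts occurrences of the positive and negative keywords and reports a tie when the counts are equal; the exact procedure is given in [235]. The word lists are those of “Human 1” in Figure 3.1, while the function names, the tokenizer, and the review snippets are invented for illustration.

```python
import re

# Word lists for "Human 1" as reported in Figure 3.1.
POSITIVE = {"dazzling", "brilliant", "phenomenal", "excellent", "fantastic"}
NEGATIVE = {"suck", "terrible", "awful", "unwatchable", "hideous"}

def tokenize(text):
    """Lowercase and split into alphabetic word tokens (a simplification)."""
    return re.findall(r"[a-z']+", text.lower())

def classify_by_keywords(text, positive=POSITIVE, negative=NEGATIVE):
    """Return "positive", "negative", or "tie" by comparing keyword counts."""
    tokens = tokenize(text)
    pos_hits = sum(token in positive for token in tokens)
    neg_hits = sum(token in negative for token in tokens)
    if pos_hits > neg_hits:
        return "positive"
    if neg_hits > pos_hits:
        return "negative"
    return "tie"

# Invented review snippets, for illustration only.
print(classify_by_keywords("A brilliant cast and a fantastic script."))
print(classify_by_keywords("Awful pacing and a hideous score."))
print(classify_by_keywords("I did not care for it at all."))
```

The third snippet is clearly negative but contains none of the listed words, so the policy can only report a tie. Short hand-picked lists leave many documents undecided in this way.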

                  Proposed word lists                                       Accuracy (%)  Ties (%)
Human 1           positive: dazzling, brilliant, phenomenal, excellent,     58            75
                            fantastic
                  negative: suck, terrible, awful, unwatchable, hideous
Human 2           positive: gripping, mesmerizing, riveting, spectacular,   64            39
                            cool, awesome, thrilling, badass, excellent,
                            moving, exciting
                  negative: bad, cliched, sucks, boring, stupid, slow
Statistics-based  positive: love, wonderful, best, great, superb, still,    69            16
                            beautiful
                  negative: bad, worst, stupid, waste, boring, ?, !

Fig. 3.1 Sentiment classification using keyword lists created by human subjects (“Human 1” and “Human 2”), with corresponding results using keywords selected via examination of simple statistics of the test data (“Statistics-based”). Adapted from Figures 1 and 2 in Pang et al. [235].

However, the fact that it may be non-trivial for humans to come up with the best set of keywords does not in itself imply that the problem is harder than topic-based categorization. While the feature “still” might not be likely for any human to propose from introspection, given training data, its correlation with the positive class can be discovered via a data-driven approach, and its utility (at least in the movie review domain) does make sense in retrospect. Indeed, applying machine learning techniques based on unigram models can achieve over 80% in accuracy [235], which is much better than the performance based on hand-picked keywords reported above. However, this level of accuracy is not quite on par with the performance one would expect in typical topic-based binary classification.
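As an illustration of how a data-driven approach can pick up a term such as “still,” the following is a minimal sketch of a unigram classifier. It assumes multinomial Naive Bayes with add-one smoothing, which is only one possible unigram-based learner and is not claimed to be the configuration behind the figure quoted above. The training sentences are invented and far too few for real use; all function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens, keeping "?" and "!" as tokens of their own."""
    return re.findall(r"[a-z']+|[?!]", text.lower())

def train_naive_bayes(labeled_docs):
    """Collect unigram counts per class from (text, label) pairs."""
    word_counts = {}          # label -> Counter of unigram counts
    doc_counts = Counter()    # label -> number of training documents
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary

def classify(text, model):
    """Pick the label with the highest add-one-smoothed log probability."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token in vocabulary:   # ignore words never seen in training
                count = word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy training set, for illustration only.
training = [
    ("a great film and still a wonderful surprise", "positive"),
    ("the best cast , still moving after all these years", "positive"),
    ("a stupid plot and a waste of time", "negative"),
    ("boring , bad , the worst sequel", "negative"),
]
model = train_naive_bayes(training)
print(classify("still great", model))
print(classify("what a waste", model))
```

In this toy set “still” occurs only in positive documents, so it pushes a text toward the positive class without anyone having proposed it as a keyword.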

Why does this problem appear harder than the traditional task when the two classes we are considering here are so different from each other? Our discussion of algorithms for classification and extraction (Section 4) will provide a more in-depth answer to this question, but the following are a few examples (from among the many we know) showing that the upper bound on problem difficulty, from the viewpoint of machines, is very high. Note that not all of the issues these examples raise have been fully addressed in the existing body of work in this area.

Compared to topic, sentiment can often be expressed in a more subtle manner, making it difficult to identify from any of a sentence’s or document’s terms when considered in isolation. Consider the following examples:

• “If you are reading this because it is your darling fragrance, please wear it at home exclusively, and tape the windows shut.” (review by Luca Turin and Tania Sanchez of the Givenchy perfume Amarige, in Perfumes: The Guide, Viking 2008.) No ostensibly negative words occur.

• “She runs the gamut of emotions from A to B.” (Dorothy Parker, speaking about Katharine Hepburn.) No ostensibly negative words occur.

In fact, the example that opens this section, which was taken from the following quote from Mark Twain, is also followed by a sentence with no ostensibly negative words:

    Jane Austen’s books madden me so that I can’t conceal my frenzy from the reader. Every time I read ‘Pride and Prejudice’ I want to dig her up and beat her over the skull with her own shin-bone.
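The point can be checked mechanically. The sketch below merges the negative word lists of Figure 3.1 into one set and looks for them in the examples above. The lists were compiled for movie reviews, so part of the mismatch is a matter of domain; the variable and function names are hypothetical.

```python
import re

# Negative word lists from Figure 3.1 (both human subjects and the
# statistics-based list), merged into one set for this sketch.
NEGATIVE_KEYWORDS = {
    "suck", "terrible", "awful", "unwatchable", "hideous",
    "bad", "cliched", "sucks", "boring", "stupid", "slow",
    "worst", "waste", "?", "!",
}

EXAMPLES = {
    "perfume review": "If you are reading this because it is your darling "
                      "fragrance, please wear it at home exclusively, and "
                      "tape the windows shut.",
    "Parker": "She runs the gamut of emotions from A to B.",
    "Twain": "Every time I read 'Pride and Prejudice' I want to dig her up "
             "and beat her over the skull with her own shin-bone.",
}

def negative_hits(text):
    """Return the tokens of `text` that appear in the negative keyword set."""
    tokens = re.findall(r"[a-z'-]+|[?!]", text.lower())
    return [t for t in tokens if t in NEGATIVE_KEYWORDS]

for name, text in EXAMPLES.items():
    print(name, negative_hits(text))
```

None of the three texts contains a listed keyword, although each is plainly negative to a human reader.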
