
Current State of Text Sentiment Analysis from Opinion to Emotion Mining

University of Alberta

Abstract

Sentiment analysis from text consists of extracting information about opinions, sentiments, and even emotions conveyed by writers towards topics of interest. It is often equated to opinion mining, but it should also encompass emotion mining. Opinion mining involves the use of natural language processing and machine learning to determine the attitude of a writer towards a subject. Emotion mining uses similar technologies but is concerned with detecting and classifying writers' emotions toward events or topics. Textual emotion-mining methods have various applications, including gaining information about customer satisfaction, helping in selecting teaching materials in e-learning, recommending products based on users' emotions, and even predicting mental-health disorders. In surveys on sentiment analysis, which are often old or incomplete, the strong link between opinion mining and emotion mining is understated. This motivates the need for a different and new perspective on the literature on sentiment analysis, with a focus on emotion mining. We present the state-of-the-art methods and propose the following contributions: (1) a taxonomy of sentiment analysis; (2) a survey on polarity classification methods and resources, especially those related to emotion mining; (3) a complete survey on emotion theories and emotion-mining research; and (4) some useful resources, including lexicons and datasets.

CCS Concepts: • General and reference → Surveys and overviews; • Information systems → Data mining;
Additional Key Words and Phrases: Emotion detection, text mining, polarity classification, opinion mining, sentiment analysis, data mining, machine learning

ACM Reference Format:

Ali Yadollahi, Ameneh Gholipour Shahraki, and Osmar R. Zaiane. 2017. Current state of text sentiment analysis from opinion to emotion mining. ACM Comput. Surv. 50, 2, Article 25 (May 2017), 33 pages.

"Sentiment analysis," one of the fields in "affective computing," refers to all the areas of detecting, analyzing, and evaluating humans' state of mind towards different events, issues, services, or any other interest. More precisely, this field aims to mine opinions, sentiments, and emotions based on observations of people's actions that can be captured using their writings, facial expressions, speech, music, movements, and so on. Analysis of sentiments from each of these media is a specific field of study. Here we focus only on text sentiment analysis. For further information regarding other types of sentiment analysis, one can refer to Yang and Chen [2012], El Ayadi et al. [2011], Zeng et al. [2009], Kleinsmith and Bianchi-Berthouze [2013], and D'mello and Kory [2015].
Fig. 1. Taxonomy of sentiment analysis tasks.
Text sentiment analysis has been an attractive topic of study since the mid-1990s; however, there is hardly any systematic organization of the tasks in this area, and people use different terms to refer to different tasks. For example, sentiment analysis, opinion mining, and polarity classification, which we define below, are used to address the same concept, although this is not sound either lexically or semantically. This is why having a clear definition of terms and a logical taxonomy of sentiment analysis work is one of our concerns.
According to the definition in the Merriam-Webster dictionary, sentiment is an attitude, thought, or judgment prompted by a feeling. In other words, sentiment is an opinion or idea colored by an emotion. Therefore, analyzing the sentiment of a unit of text can encompass investigating both the opinion and the emotion behind that unit.
It is easy to confuse opinion and emotion, since they are strongly correlated. For instance, in many situations emotion motivates a person to judge an entity and build opinions about it. Additionally, the opinion of a person can cause emotions in others. However, a text unit can indicate contradicting opinions and emotions. For instance, the sentence "My family thinks it's a good decision to continue my education overseas, though they feel sad to miss me" represents a positive opinion and a negative emotion toward the same topic.
For these reasons, we categorize the field of sentiment analysis into two parts: (1) opinion mining, dealing with the expression of opinions, and (2) emotion mining, concerned with the articulation of emotions. Opinion mining is more concerned with the concept of opinions expressed in texts that can be positive, negative, or neutral, while emotion mining is the study of emotions (e.g., joy, sadness) reflected in a piece of text. Hence, to have a sound terminology of problems, we should distinguish between them. Figure 1 shows the categorization of sentiment analysis into these two tasks and the subtasks of each. These subtasks are defined as follows.

Opinion-mining tasks:

-Subjectivity Detection: The task of detecting if a text is objective or subjective. Objective texts carry some factual information, for example, "The sky is blue," while subjective texts express somebody's personal views or opinions, for example, "I like the color blue" [Liu 2011].
-Opinion Polarity Classification: The task of determining whether the text expresses positive or negative (or sometimes neutral) opinion. As mentioned above, "sentiment analysis" and "opinion mining" are used as synonyms of "polarity classification," which is restrictive. Section 2 of this article discusses many of the previous works corresponding to this subtask.
-Opinion Spam Detection: The task of detecting fake opinions in favor of or against a product or service that malicious users intentionally write to make their target popular or unpopular. The work of Jindal and Liu [2008] is one of the first attempts with promising results in this area of study.
-Opinion Summarization: The task of summarizing a large number of opinions toward a topic, encompassing different perspectives, aspects, and polarities. This is especially important when someone wants to make a decision, because a single opinion cannot be trusted. The work of Hu and Liu [2004] is an example of opinion summarization on product reviews.
-Argument Expression Detection: The task of identifying argumentative structures and the relationship between different arguments within a document, such as one being opposed to the other. The work of Lin et al. [2006] is a notable earlier example.

Emotion-mining tasks:

-Emotion Detection: The task of detecting if a text conveys any type of emotion or not. This is similar to subjectivity detection for opinions and is addressed in Gupta et al. [2013].
-Emotion Polarity Classification: The task of determining the polarity of the existing emotion in a text, assuming that it has some. This is similar to opinion polarity classification. Examples of this study can be found in Alm et al. [2005] and Hancock et al. [2007].
-Emotion Classification: The task of fine-grained classification of existing emotion in a text into one (or more) of a set of defined emotions. Most of the literature that we elaborate on later in this article falls into this category.
-Emotion Cause Detection: The task of mining factors for eliciting some kinds of emotions, as in the early work by Lee et al. [2010] and a later work by Gao et al. [2015b].
As can be inferred from the definitions, we distinguish between the words "detection" and "classification." The answer to a detection problem (of an opinion or emotion) is yes or no, indicating whether the text contains any opinion or emotion at all. The answer to a classification problem, however, is the exact type of opinion (positive, negative) or emotion (joy, sadness, etc.) of the target text.
Figure 1 also shows the distinction among the terms sentiment analysis, opinion mining, and polarity classification. In the literature, all these terms are used to refer to the problem of opinion polarity classification; however, we see that opinion polarity classification is a subtask of opinion mining, and opinion mining, in turn, is a subtask of sentiment analysis. In this article, we use each term for its exact and specific task and differentiate among them.

1.1. Motivation

As Figure 1 shows, there is a rich body of research on opinion mining, and many focused and specialized areas are investigated, while emotion mining from text is still in its infancy and has a long way to go. Emotion mining is an interesting topic in many disciplines such as neuroscience, cognitive sciences, and psychology. Only recently has it attracted attention in computer science. Developing systems that can detect emotions from text has many potential applications. In customer care services, emotion mining can help marketers gain information about how satisfied their customers are and what aspects of their service should be improved or revised to build a strong relationship with their end users [Gupta et al. 2013]. Users' emotions can additionally be used for sales predictions of a particular product. In e-learning applications, an Intelligent Tutoring System can decide on teaching materials based on the user's feelings and mental state. In Human Computer Interaction, the computer can monitor the user's emotions to suggest suitable music or movies [Voeffray 2011]. The ability to identify emotions enables new textual access approaches, such as allowing users to filter the results of a search by emotion. In addition, the output of an emotion-mining system can serve as input to other systems. For instance, Rangel and Rosso [2016] use the emotions detected in the text for author profiling, specifically identifying the writer's age and gender. Last but not least, psychologists can infer patients' emotions and predict their state of mind accordingly. Over a longer period of time, they are able to detect whether a patient is facing depression or stress [De Choudhury et al. 2013] or is even thinking about committing suicide, which is extremely useful, since the patient can be referred to counseling services [Luyckx et al. 2012].
On the other hand, with the explosive growth of Web 2.0 technology, different media are available for people to express themselves and their feelings. This has added another aspect to the area. There is research on detecting emotions from text, facial expressions, images, speeches, paintings, songs, and other sorts of media [Busso et al. 2004; Wieczorkowska et al. 2006]. Among these, facial expressions and voice-recorded speech contain the most dominant clues and have been widely studied. There are also studies on combining different types of information, such as features from text and images, including the work of Zhang et al. [2015]. Here we focus only on text and therefore cannot take advantage of the information conveyed via facial or audio channels. Personal notes, emails, news headlines, blogs, tales, novels, and chat messages are some types of text that can convey emotions. In particular, popular social networking websites such as Twitter, Facebook, and MySpace are appropriate places to share one's feelings easily and widely.
There exist some comprehensive surveys on sentiment analysis by Pang and Lee [2008] and Liu [2012], where the latter was expanded in Liu [2015]. While methods and techniques discussed in these articles can be applied to the field of emotion mining as well, none of them specifically covers this task. There are also some surveys focusing on emotion mining, such as the works by Kao et al. [2009] and Jain and Kulkarni [2014], but they are rather incomplete. In addition, most of the works on emotion mining do not consider the strong link between emotion and opinion mining. In fact, many of the methods and techniques used in opinion mining can also be applied to emotion-mining problems. These facts motivate us to cover the state-of-the-art methods and resources developed for this popular task from a sentiment-analysis-oriented perspective, as a complement to existing sentiment analysis surveys.
In addition, as shown in Figure 1, polarity classification can be applied to both opinion and emotion; however, in the literature it almost always refers to opinion polarity classification. For instance, Pang and Lee [2008] mention: "The binary classification task of labelling an opinionated document as expressing either an overall positive or an overall negative opinion is called sentiment polarity classification or polarity classification." Nevertheless, the proposed techniques and methods are useful for emotion polarity classification as well, for two reasons: (1) opinion and emotion are semantically related concepts: generally, having an opinion towards an entity can cause the person to feel an emotion in the same direction (positive or negative); and (2) these techniques often do not have any opinion-specific characteristic and hence can be applied directly to emotionally labeled problems, too. Considering this inference, we believe it is worth reviewing polarity classification methods before entering emotion research.
This article is organized as follows: In Section 2, key elements of the polarity classification task are explained, and those works in this area that can be useful for the emotion-mining task are reviewed. In Section 3, a set of important resources, including lexicons and datasets that researchers need for a polarity classification task, are introduced. Section 4 reviews emotion theories in order to gain knowledge about basic emotions. A thorough survey on emotion-related research is given in Section 5. Section 6 is dedicated to introducing useful resources specific to emotion-mining work, and, finally, Section 7 summarizes and concludes the discussion.


Polarity classification is the task of classifying the opinion of a given text as falling under one of two opposing sentiment polarities, the most famous of which is "like" vs. "dislike" [Pang and Lee 2008]. Although much of the work in this area has been done on products and services reviews, which mostly hold positive or negative opinions, there are other problems where "like" or "dislike" are interpreted as other concepts such as different political views [Pang and Lee 2008]. As stated in Section 1, different media can be used to express opinions, among which we only focus on text. For more information about other types of polarity classification, one can refer to Morency et al. [2011].
Automatic classification of polarity can be categorized with respect to various perspectives. In terms of granularity, it can be done on a document, sentence, or aspect level.
-Document level: In this category, the whole document, whether short or long, is the atomic unit of input to the problem, and the polarity of the whole document is the essence of the study. Document-level polarity classification accounts for most of the body of work in this area and is considered the simplest sentiment analysis task in the research community [Liu 2015]. At the same time, it is in wide demand, since most online data consists of documents such as reviews, blog posts, and comments. Document-level polarity classification is an essential requirement for studies such as social and psychological studies in social networks [Ortigosa et al. 2014; Gao et al. 2015a], consumer satisfaction [Kang and Park 2014], and analyzing patients in medical settings [Denecke and Deng 2015].
-Sentence level: The objective of this group of studies is to determine the polarity of a sentence. As noted in Neviarouskaya et al. [2007], a challenge at this level is the influence of the surrounding context on the sentence. For example, depending on the context in which it is used, the sentence "I can't really describe this product better than this" can be either positive or negative. Polarity classification of tweets, which has been extensively studied in recent years, is the most interesting application of sentence-level polarity classification.
-Aspect level: This category, also known as feature-based opinion mining, encompasses the study of discovering opinion polarities about a specific aspect of a product or service. For instance, opinions on restaurants can be about two aspects of quality, namely the food and the cleanliness of the restaurant. This category of works is highly useful for business owners and politicians to gain insights about aggregations of people's opinions regarding various features of their product and services, where document- or sentence-level classifications do not suffice.
Extraction of aspects from text and polarity classification of the extracted aspects are the two major components of aspect-level polarity classification. The work of Hu and Liu [2004] is one of the earliest in this field. Further attempts mostly focused on enhancing only one of these components. For instance, one of the most important groups of works in this category is devoted to utilizing topic modeling in aspect extraction, such as the work of Lin and He [2009], Jo and Oh [2011], Mukherjee and Liu [2012], and Wang et al. [2016].
With respect to the nature of the data, there are two important modes of the problem. Some datasets benefit from being annotated by a human, while there are many unlabeled datasets of reviews and posts. Methods working with labeled data often show better results; nevertheless, they require manual labeling, which one might be unable to afford. In the following two subsections, we discuss previous methods on annotated and unannotated text data, respectively.

2.1. Works on Annotated Data

The algorithms that deal with labeled data are called "supervised methods." Supervised methods apply some machine-learning algorithms on a set of training data to be able to predict the label of unseen test data. They need an annotated dataset of texts for the task of training, which creates a model to discriminate between polarities.
In order to apply machine-learning methods, one should represent the text by means of descriptive features. After that, some technique should be used to train a polarity classifier. Most solutions introduced in the literature are general-purpose machine-learning techniques, while some of them are sentiment specific. Sebastiani [2002] was the first to apply general text categorization algorithms to the field of sentiment detection. Later, Pang et al. [2002] compared the performance of Support Vector Machine (SVM) and Naïve Bayes classifiers on movie reviews.
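As a minimal sketch of the supervised setting described above, the following trains a multinomial Naïve Bayes polarity classifier with add-one smoothing on a tiny hand-made training set. The training sentences, the whitespace tokenizer, and the names `train_nb` and `predict_nb` are illustrative assumptions and are not taken from any of the cited works.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_nb(labeled_texts):
    """Collect per-class word counts and class counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in labeled_texts:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return word_counts, class_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, for illustration only.
TRAIN = [
    ("great movie with a wonderful cast", "positive"),
    ("wonderful story and great acting", "positive"),
    ("boring plot and terrible acting", "negative"),
    ("terrible movie with a boring cast", "negative"),
]
model = train_nb(TRAIN)
```

With real data, the training set would be an annotated corpus such as a collection of labeled reviews, and the unseen test texts would be held out from training.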
Representation learning methods have shown promising classification results in various applications, one of which is polarity classification. Socher et al. [2013] utilize deep learning to train a Treebank sentiment classifier, Tang et al. [2014a] develop a deep learning Twitter sentiment model, dos Santos and Gatti [2014] apply Deep Convolutional Neural Networks to classifying short text, Tang et al. [2014b] develop neural networks to find continuous word representation along with the sentiment of the word, and Tang [2015] attempts to encapsulate features of a document using cascaded constitutes and to learn sentiment of documents. All these works attempt to find a representation of the polarity by applying various layers of hidden nodes, among which the first layer consists of the raw features of the text.
A fairly large part of the literature is dedicated to finding out the usefulness of many features and techniques in learning. The most common types of those features, which have been also applied in other areas of text mining, are as follows.
2.1.1. Presence-Based and Frequency-Based Features. The most common way to describe a piece of text is by using a binary vector in which each element corresponds to one term from a dictionary. The element at index i in the vector is set to 1 if the i-th term of the dictionary is present in the text and to 0 otherwise. Likewise, one may describe the text as a vector representing the number of times individual terms have been repeated. The former is called the presence-based and the latter the frequency-based type of feature. Although term frequency is a popular feature in information retrieval, Pang et al. [2002] obtain better performance when using presence-based features.
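The difference between the two feature types can be illustrated with a short sketch. The dictionary `VOCABULARY` and the sample text are made up for illustration; a real system would build the dictionary from the training corpus.

```python
from collections import Counter


def frequency_vector(tokens, vocabulary):
    """Count how many times each vocabulary term occurs in the token list."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def presence_vector(tokens, vocabulary):
    """Set each element to 1 if the vocabulary term occurs at least once, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


# Hypothetical dictionary and review text, for illustration only.
VOCABULARY = ["good", "bad", "plot", "acting"]
tokens = "good plot good acting really good".split()

freq = frequency_vector(tokens, VOCABULARY)  # [3, 0, 1, 1]
pres = presence_vector(tokens, VOCABULARY)   # [1, 0, 1, 1]
```

The word "really" is not in the dictionary and is therefore ignored by both representations.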
2.1.2. Unigram and n-Gram Features. A unigram refers to one single word in a text, and an n-gram represents a group of n adjacent words in a sentence, preserving their order. Although n-grams carry more information than unigram features, since they capture the position of words in the sentence and their use as a group, whether they are more effective in increasing performance is a matter of some debate. For instance, Pang et al. [2002] report that unigrams are more effective than n-grams; however, other research, such as the work of Dave et al. [2003], indicates better results for the combination of bigrams and trigrams.
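A short sketch of n-gram extraction follows. The function name `ngrams` and the sample sentence are illustrative assumptions; tokenization is reduced to whitespace splitting.

```python
def ngrams(tokens, n):
    """Return all runs of n adjacent tokens, in their original order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "this is not a good movie".split()
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
```

Here the bigram `("not", "a")` and the trigram `("not", "a", "good")` keep the negation next to the words around it, which is order information that the unigrams alone do not preserve.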
2.1.3. Part of Speech. Some types of words are more likely to carry information about the polarity of a sentence or document, and, hence, part of speech can be a good discriminator in order to detect such words. It is indicated in previous works that adjectives are very important in determining the sense of the text. In fact, adjectives can be used both as main features, such as in works by Mullen and Collier [2004] and Whitelaw et al. [2005], and as filters for selecting other features. For instance, Turney uses adjectives to detect a set of phrases as features and then determines the polarity of documents based on those features [Turney 2002].
In addition to adjectives, other part-of-speech tags such as nouns like "gem" or verbs such as "love" can improve the performance of the task [Pang and Lee 2008]. Some previous works focus on comparing the effectiveness of adjectives, adverbs, verbs, and nouns in the classification task, including Benamara et al. [2007], Nasukawa and Yi [2003], and Wiebe et al. [2004].
2.1.4. Syntax. Several researchers investigate the use of dependency-based features obtained from dependency trees [Liu 2011]. Previous works report contradicting results regarding the effectiveness of such dependencies. Slight improvements in performance are reported in Dave et al. [2003] and Gamon [2004], while Ng et al. [2006] conclude that adding dependency-based features does not offer any improvement over a simple n-gram-based classifier.
2.1.5. Negation. The use of negating words in a sentence may completely flip the polarity of that sentence. For instance, ignoring "not" in "He does not like the color blue" results in a false positive. Attaching "not" to the words occurring near the negating words is one of the elementary techniques, first applied by Das and Chen [2001]. Although the naïve assumption that each negation word flips the polarity of a window of following words works in many cases, it is not a general rule. Later works try to optimize this technique by reversing the polarity of phrases based on part-of-speech tag patterns [Na et al. 2004].
Besides the explicit negation words, there are other terms that may negate a sentence. For instance, the verb "prevent" in the sentence "They prevent keeping unhealthy foods in the store" and the verb "deny" in "She denies admiring the brand" are implicitly reversing the polarity.
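The window-based treatment of explicit negation can be sketched as follows. The `NOT_` prefix, the word list `NEGATION_WORDS`, and the default window size are assumptions made for illustration; they are not the exact conventions of the cited works.

```python
NEGATION_WORDS = {"not", "no", "never", "cannot"}  # illustrative, not exhaustive


def mark_negation(tokens, window=3):
    """Prefix up to `window` tokens after each negation word with 'NOT_'."""
    marked = []
    remaining = 0  # how many upcoming tokens are still inside a negation window
    for tok in tokens:
        if tok.lower() in NEGATION_WORDS:
            marked.append(tok)
            remaining = window
        elif remaining > 0:
            marked.append("NOT_" + tok)
            remaining -= 1
        else:
            marked.append(tok)
    return marked


marked = mark_negation("He does not like the color blue".split(), window=2)
```

After marking, "like" and "NOT_like" are distinct features for a classifier. Implicit negators such as "prevent" or "deny" are not in the word list, so this sketch leaves sentences containing them unchanged.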
2.1.6. Topic-Oriented Features. The sentiment of a given sentence may be topic specific. For instance, the word "fast" is considered positive in the context of car reviews, while it may be considered negative in movie reviews. Different topic-based features are investigated in the literature, especially in the work of Mullen and Collier [2004].

2.2. Works on Unannotated Data

Coming up with a solution for unannotated data is harder than for annotated data because of the lack of labels. In fact, most informative and subjective text formats, such as comments, reviews, and news, are left unlabeled, and hence they cannot be used directly to train a classifier.
Researchers try to tackle the problem of unlabeled data from a wide range of perspectives. We have categorized the related methods into three groups. The first group of solutions aims to expand a lexicon that contains words and their prior polarity and is explained in Section 2.2.1. The second group is concerned with domain adaptation, which is described in Section 2.2.2. Most of the works on unannotated data fall into one of these two categories. However, many other types of work have also been done in this scope; we describe these methods in Section 2.2.3.
2.2.1. Lexicon Expansion. A very basic and simple idea for building a classifier for unannotated data is to use a lexicon of words. A lexicon is a dictionary of words in which each word is associated with a score showing its degree of polarity. If it is developed for emotion-mining purposes, then it may show the degree for each of the possible emotions. At classification time, the polarity scores of the words contained in a test sample are fetched and processed in order to predict the polarity of the whole text. These scores can be processed in different ways, including summing them up, taking their average, and so on. This generic solution is called a "lexical-based" method. Existing lexicons can be used for this purpose; however, to achieve higher performance, one may need to create a lexicon of words suitable for the domain in question. Since manually building a lexicon is a tedious and time-consuming task, automatic solutions, called "lexicon expansion" methods, have been suggested. Researchers apply different methods to create the lexicon automatically from the information lying in the data. This type of method, which expands the lexicon based on the information in the corpus, is called "corpus-based lexicon expansion." The first works belong to Hatzivassiloglou and McKeown [1997], Hatzivassiloglou and Wiebe [2000], Pang et al. [2002], and Yu and Hatzivassiloglou [2003]. They approached the problem by making simple assumptions about the occurrences of words. For instance, Pang et al. [2002] assumed that words present near the word "excellent" could be counted as positive, while words adjacent to the word "poor" could be counted as negative. In general, the words with which the lexicon expansion is initiated are called "seed words."
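The lexicon lookup and score aggregation described above can be sketched in a few lines. The lexicon entries, their scores, and the zero threshold for a neutral label are hypothetical choices made for illustration.

```python
# Hypothetical polarity lexicon with scores in [-1, 1], for illustration only.
LEXICON = {"excellent": 1.0, "good": 0.5, "poor": -0.5, "terrible": -1.0}


def lexicon_score(tokens, lexicon, aggregate="sum"):
    """Aggregate the prior polarity scores of the tokens found in the lexicon."""
    scores = [lexicon[tok] for tok in tokens if tok in lexicon]
    if not scores:
        return 0.0
    total = sum(scores)
    return total / len(scores) if aggregate == "mean" else total


def classify(tokens, lexicon):
    """Map the summed score to a polarity label; zero is treated as neutral."""
    score = lexicon_score(tokens, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = "excellent acting but poor sound and terrible pacing".split()
label = classify(review, LEXICON)
```

No training data is involved: the quality of the prediction depends entirely on how well the lexicon covers the vocabulary of the domain, which is what motivates lexicon expansion.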
2.2.1.词库扩展。为无标注数据建立分类器的一个非常基本和简单的方法是使用词库。词库是由单词组成的词典,每个单词都与显示其极性程度的分数相关联。如果词库是为情感挖掘目的而开发的,那么它可以显示每种可能情感的极性程度。在分类时,会获取并处理测试样本中每个单词的极性得分,以预测整个文本的极性。处理这些分数的方法多种多样,包括求和、取平均值等。这种通用解决方案被称为 "基于词库 "的方法。目前,现有的词库可用于此目的;但是,为了获得更高的性能,人们可能需要创建自己的适合相关领域的词库。由于手动创建词库是一项繁琐耗时的任务,因此有人提出了自动解决方案,即 "词库扩展 "方法。研究人员采用不同的方法从数据信息中自动创建词库。这种根据语料库信息扩充词典的方法被称为 "基于语料库的词典扩充"。最早的研究属于 Hatzivassiloglou 和 McKeown [1997]、Hatzivassiloglou 和 Wiebe [2000]、Pang 等人 [2002] 以及 Yu 和 Hatzivassiloglou [2003]。他们通过对词语出现情况的简单假设来解决这个问题。例如,Pang 等人[2002]假定 "优秀 "一词附近出现的词可以算作正面词,而 "差 "一词附近出现的词可以算作负面词。一般来说,词典扩充开始时的潜在词被称为 "种子词"。
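The lexical-based method described above can be illustrated with a minimal sketch. The toy lexicon, its scores, the tokenizer, and the function name `lexicon_polarity` are all assumptions made for illustration; they do not come from any of the cited works.

```python
import re
from statistics import mean

# Hypothetical toy lexicon: word -> prior polarity score (values invented for illustration).
TOY_LEXICON = {"excellent": 1.0, "good": 0.5, "poor": -1.0, "boring": -0.5}


def lexicon_polarity(text, lexicon, aggregate="sum"):
    """Classify text as positive, negative, or neutral from word-level scores."""
    # Simple tokenizer (an assumption): lowercase alphabetic runs and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[token] for token in tokens if token in lexicon]
    if not scores:
        return "neutral"  # no lexicon word occurs in the text
    # Combine the fetched scores by summing or by averaging.
    combined = sum(scores) if aggregate == "sum" else mean(scores)
    if combined > 0:
        return "positive"
    if combined < 0:
        return "negative"
    return "neutral"
```

Summing and averaging always agree on the sign of the result, so the choice matters only when the combined score is later used as a degree of polarity rather than as a class label.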
Further attempts to create a useful lexicon were concerned with grouping words or phrases into sentiment clusters, including the works of Andreevskaia and Bergler [2006], Esuli and Sebastiani [2005, 2006a, 2006b], Finn and Kushmerick [2006], Takamura et al. [2007], and Kaji and Kitsuregawa [2007]. A notable attempt in this line of work is that of Hatzivassiloglou and Wiebe [2000]. They created a lexicon by using "opposition constraints," that is, conjunctions such as "but" and "and" between pairs of words, and then clustered the words into two partitions.
Once the clusters of words are found, different techniques have been proposed to assign a sentiment orientation (or degree of polarity) to them. For instance, Hatzivassiloglou and Wiebe [2000] simply assume that the more frequent words are the positive ones.
Another popular technique is to start from a set of seed words with known polarity and then to assign the polarity of new words according to their relationship to the seed words. In other words, the polarity of the new words is assigned by propagating the polarity of the seed words (based on the clustering results), as in the work of Andreevskaia and Bergler [2006], Gamon and Aue [2005], Esuli and Sebastiani [2005, 2006a], and Kamps et al. [2004]. This category of methods is called "dictionary-based lexicon expansion."
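As a rough illustration of seed-word propagation, the sketch below spreads seed polarities through synonym and antonym links. It is a simplified stand-in and not the procedure of any cited work: the `synonyms` and `antonyms` dictionaries are hypothetical placeholders for a real dictionary resource, and the rule "synonyms keep the polarity, antonyms flip it" is an assumption.

```python
from collections import deque


def propagate_polarity(seeds, synonyms, antonyms):
    """Breadth-first propagation of seed polarities (+1 or -1) to linked words.

    seeds:    dict mapping seed word -> +1 or -1
    synonyms: dict mapping word -> list of synonyms (hypothetical dictionary data)
    antonyms: dict mapping word -> list of antonyms (hypothetical dictionary data)
    """
    polarity = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        links = [(w, 1) for w in synonyms.get(word, [])]
        links += [(w, -1) for w in antonyms.get(word, [])]
        for neighbour, sign in links:
            if neighbour not in polarity:  # the first label to reach a word wins
                polarity[neighbour] = sign * polarity[word]
                queue.append(neighbour)
    return polarity
```

For example, with the seed `{"good": 1}`, the synonym links `good -> great -> superb` and `bad -> awful`, and the antonym link `good -> bad`, the function labels "great" and "superb" as positive and "bad" and "awful" as negative.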
It is worth pointing out that most of the mentioned methods try to find the "prior polarity" of words. The "prior polarity" of a word is the polarity that it evokes regardless of the context in which it occurs, while "contextual polarity" is the polarity of the word with respect to its context. For instance, since the word "security" generally bears a positive polarity, we can assume a positive prior polarity for it. However, if it occurs inside a sentence like "There are three living former Secretaries of Homeland Security," then it does not convey any positive or negative polarity, since it is part of a referring expression. Therefore, it has a neutral "contextual polarity." Prior polarity should then be used to determine the "contextual polarity" of words with respect to the concept and domain, as in the work of Wilson et al. [2005b].
2.2.2. Domain Adaptation. One way to produce a generic classification method that adapts to any kind of data in any domain, and that is especially useful for unannotated data, is to train a classifier over a labeled dataset from one domain or topic, called the "source," and use it to label the unlabeled data from another domain, called the "target." However, the results of doing so across various domains have been shown to be unsatisfactory [Blitzer et al. 2007]. This is expected, since the keywords and phrases used in one domain may differ completely from those used in another. Furthermore, a word may bear a different sentiment in one domain than it does in another. Therefore, adapting the classifier trained over the source so that it is useful for the target is an essential step. This procedure is called "domain adaptation."
According to Jiang and Zhai [2007], domain adaptation can be viewed from two distinct perspectives, namely "labeling adaptation" and "instance adaptation." In labeling adaptation, the labeling function is adapted, since some features (words, in opinion mining) may differ in polarity between the source and target domains. In instance adaptation, the probability function of the features is adjusted; for instance, the changes in word frequency from one domain to another are modeled.
An early attempt at the problem is the work of Aue and Gamon [2005], who evaluate the performance of four rudimentary approaches to adapting a classifier to the target domain: training over all possible domains, limiting features to those observed in the target domain, using an ensemble of classifiers, and using a small set of labeled in-domain data.
Another simple attempt is the work of Yang et al. [2006], who ranked the features of two labeled datasets by running logistic regression over the sentences and selected the highly ranked features as the ones most common across all domains.
Label transferring is another method used in some of the previous works on domain adaptation. The basic idea is to find the most informative samples of the target domain by means of a classifier trained over the source domain and then to label those informative instances in order to train a brand new classifier over them. The first work that exploited this idea is Tan et al. [2007]. Later, the same team improved the performance of their system by selecting "generalizable features" by means of a measure they named "Frequently Co-occurring Entropy." More recently, Li et al. [2013] applied the same idea by finding the most informative instances in the target domain using classifiers with a query-by-committee strategy.
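A query-by-committee selection step can be sketched as follows. This is a generic illustration and not the system of Li et al. [2013]: it assumes that "most informative" means "highest disagreement among committee members" and measures disagreement with vote entropy. The function names are hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the label votes cast by the committee for one instance."""
    counts = Counter(votes)
    n = len(votes)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def most_informative(instances, committee, k):
    """Return the k instances on which the committee disagrees the most.

    committee: list of callables, each mapping an instance to a label
    (for example, classifiers trained on the source domain).
    """
    scored = [
        (vote_entropy([classify(x) for classify in committee]), index)
        for index, x in enumerate(instances)
    ]
    # Highest entropy first; ties keep the original order of the instances.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [instances[index] for _, index in scored[:k]]
```

The selected instances would then be labeled and used to train a new classifier for the target domain.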
A very common technique, used in different forms in previous works, is to cluster the features of every domain into two groups. The first group, called "domain-independent" features, contains features that occur frequently regardless of the domain. The second group, called "domain-specific" features, contains features that are common only within their own domain. The purpose of this clustering is to align the domain-specific features of the source domain with those of the target domain and then adapt the classifier trained on the source domain.
Based on the explanation above, two steps should be followed:
Clustering features: Several methods have been suggested to distinguish the two types of features. Domain-independent features are recognized as those that occur more often than a threshold in every domain. To find domain-specific ones, the degree of dependency of each feature on each domain should be calculated. This can be done with the information-theoretic measure of "mutual information" between the feature and the domain.
Alignment: Alignment is a step in which each domain-specific feature in the target domain is mapped to one or more domain-specific features in the source domain. In the literature, this is done in various ways. In one of the first attempts, Blitzer et al. [2007] approached the problem with an algorithm called "structural correspondence learning" (SCL). SCL takes the most frequent features as the domain-independent features (pivot features) and finds the correspondence model between pivot features and all other features by training linear pivot predictors.
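The feature-clustering step in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: each document is a set of tokens, "frequency" is document frequency, and both thresholds (`min_count`, `mi_threshold`) are hypothetical parameters. The mutual information is computed between the binary event "feature occurs in the document" and the domain of the document.

```python
import math


def mutual_information(docs_by_domain, feature):
    """Mutual information (in bits) between feature presence and domain."""
    total = sum(len(docs) for docs in docs_by_domain.values())
    p_present = sum(
        feature in doc for docs in docs_by_domain.values() for doc in docs
    ) / total
    mi = 0.0
    for docs in docs_by_domain.values():
        p_domain = len(docs) / total
        n_present = sum(feature in doc for doc in docs)
        # Two outcomes per domain: feature present, feature absent.
        for count, p_feature in ((n_present, p_present),
                                 (len(docs) - n_present, 1 - p_present)):
            p_joint = count / total
            if p_joint > 0:
                mi += p_joint * math.log2(p_joint / (p_feature * p_domain))
    return mi


def split_features(docs_by_domain, min_count, mi_threshold):
    """Split the vocabulary into domain-independent and domain-specific sets."""
    vocabulary = set().union(
        *(doc for docs in docs_by_domain.values() for doc in docs)
    )
    # Domain-independent: occurs in at least min_count documents of every domain.
    independent = {
        f for f in vocabulary
        if all(sum(f in doc for doc in docs) >= min_count
               for docs in docs_by_domain.values())
    }
    # Domain-specific: strongly dependent on the domain.
    specific = {
        f for f in vocabulary - independent
        if mutual_information(docs_by_domain, f) >= mi_threshold
    }
    return independent, specific
```

Features that are neither frequent everywhere nor strongly tied to one domain fall into neither set in this sketch; a real system would need a policy for them.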
Li et al. [2009] approach the alignment problem by using non-negative matrix tri-factorization of the term-document matrix. Basically, they factorize the term-document matrix in the source domain and then, by means of a matrix that expresses whether each word occurs in both domains, estimate the factors of the term-document matrix in the target domain.
Pan et al. [2010] aimed to cluster domain-specific words in both domains by means of a spectral feature alignment algorithm. Unlike SCL, this work promises to exploit all the relationships between the domain-specific and domain-independent words. Basically, they create a bipartite graph of features that consists of two clusters, one of domain-specific and one of domain-independent features. Then, if there exist two domain-specific features from two domains that share many domain-independent features, they align them so that they correspond to each other.
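The intuition behind this alignment (domain-specific words from two domains correspond when they share many domain-independent neighbours) can be shown with a much simplified sketch. It is not the spectral algorithm of Pan et al. [2010]: it replaces spectral clustering with a plain count of shared neighbours, and all function and parameter names are hypothetical.

```python
def pivot_neighbours(docs, specific, pivots):
    """Map each domain-specific word to the pivot words it co-occurs with.

    docs: list of token sets; specific and pivots: sets of words.
    """
    neighbours = {word: set() for word in specific}
    for doc in docs:
        present_pivots = doc & pivots
        for word in doc & specific:
            neighbours[word] |= present_pivots
    return neighbours


def align(source_docs, target_docs, source_specific, target_specific, pivots):
    """Map each target-specific word to the source-specific word that shares
    the most domain-independent (pivot) neighbours with it."""
    src = pivot_neighbours(source_docs, source_specific, pivots)
    tgt = pivot_neighbours(target_docs, target_specific, pivots)
    alignment = {}
    for t_word, t_pivots in tgt.items():
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(src), key=lambda s: len(src[s] & t_pivots), default=None)
        if best is not None and src[best] & t_pivots:
            alignment[t_word] = best
    return alignment
```

In a toy book-review source and kitchen-review target, "sharp" would be aligned with "gripping" if both co-occur with pivots such as "great" and "good".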
In addition to the clustering-alignment method, another group of solutions to the adaptation problem is based on feature selection in both the source and target domains. This approach tries to find a feature space in which the gap between the distributions of the source and target domains is smaller than in other spaces. The features of both domains are transferred to this new feature space, and a classifier is then trained on the source domain in the new feature space. This classifier is expected to perform better on the target domain.
2.2.3. Other Methods. Some other methods appropriate for unannotated data include, but are not limited to, the following.
Bootstrapping: The general idea is to use an initial classifier, pre-trained on another dataset, to label the target dataset and then to use this newly labeled dataset to train a new classifier. Kaji and Kitsuregawa [2006] use this method to label a set of HyperText Markup Language (HTML) documents with positive/negative polarities.
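The bootstrapping idea can be sketched in a few lines. The word-vote learner below is a deliberately tiny, hypothetical stand-in for a real classifier, and dropping instances that the initial classifier leaves "neutral" is an assumption made for this illustration, not a step taken from Kaji and Kitsuregawa [2006].

```python
from collections import Counter


def train_word_vote(labeled):
    """Train a tiny classifier in which each word votes for the labels it was seen with."""
    counts = {}
    for text, label in labeled:
        for word in text.lower().split():
            counts.setdefault(word, Counter())[label] += 1

    def classify(text):
        votes = Counter()
        for word in text.lower().split():
            if word in counts:
                votes.update(counts[word])
        return votes.most_common(1)[0][0] if votes else "neutral"

    return classify


def bootstrap(initial_classifier, unlabeled_target):
    """Label the target data with the initial classifier, then train a new one."""
    pseudo_labeled = [(text, initial_classifier(text)) for text in unlabeled_target]
    # Keep only the instances on which the initial classifier commits to a polarity.
    pseudo_labeled = [(t, y) for t, y in pseudo_labeled if y != "neutral"]
    return train_word_vote(pseudo_labeled)
```

The new classifier can pick up target-domain vocabulary (for example, "sharp" in kitchen reviews) that the initial classifier never saw, although it also inherits any labeling mistakes of the initial classifier.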
Belief network modelling: One recent use of belief networks is to train a model for sentiment classification. Lin and He [2009] add a sentiment layer to the structure of a well-known probabilistic document model called "Latent Dirichlet Allocation" (LDA) [Blei et al. 2003] to find the polarity of the words in a set of documents with respect to each topic.
Combining lexical and machine-learning methods: Lexical and learning methods can be combined to compensate for each other's drawbacks. To optimize the performance of a classifier initially trained over a different domain, Qiu et al. [2009] use a lexicon-based classifier: in each step, the lexical classifier first labels the data, and the learning classifier is then trained over the labeled dataset. The process continues until the two sets of results are at minimum distance. In another work, Prabowo and Thelwall [2009] build a semi-supervised hybrid classifier by using both rule-based classifiers and SVM classification.
There are other works in which the task of classification was not completely based on the raw words. For instance, Hu et al. [2013] use emoticons to find the sentiment of a given comment in social media.
Research on and analysis of polarity classification methodologies require resources such as lexicons and annotated datasets. One might need to generate these resources by labeling them manually. Annotating the sentiment of textual data can be a tedious and time-consuming task for an individual. Also, since the sentiment of a text is subjective and is interpreted differently by different audiences, more than one annotator is needed to incorporate multiple perspectives in the annotation. Although crowdsourcing tools are an option for data annotation, using them might lead to a poorly annotated resource, since the annotators contributing through these tools are mostly laypeople with no knowledge of areas such as psychology, linguistics, and sociology.

Table I. Summary of Polarity-Related Lexicons

| Name | Author | Year | Size | Set of polarities |
| --- | --- | --- | --- | --- |
| Harvard General Inquirer | P. Stone | 1968 | 11,790 | positive, neutral, and negative |
| Opinion Lexicon | B. Liu | 2005 | 6,786 | positive, negative |
| MPQA | T. Wilson | 2005 | 8,222 | positive, neutral, negative, and both |
| WPARD | D. A. Medler | 2005 | 1,400 | positive, negative |
| SentiWordNet 3.0 | S. Baccianella | 2010 | 155,287 | degree of polarity |
| NRC | S. M. Mohammad | | various sizes | positive, negative |
The challenging nature of sentiment annotation encourages most researchers to take advantage of existing resources. Even if an annotated dataset exists for the domain of the research, finding it is still time consuming. Here we introduce some of the best-known lexicons and datasets for polarity mining. Useful resources for emotion mining are explained in Section 6. Note that knowing how these resources were created helps those who wish to build their own lexicon or dataset.

3.1. Lexicons

Many publicly available lexicons have resulted from the lexicon creation and expansion efforts of previous sentiment analysis works. Among them, the following are known to be the most frequently used and effective in the literature (Table I summarizes these lexicons).
3.1.1. Harvard General Inquirer. This lexicon is the result of one of the first attempts [Stone et al. 1968] to compile a list of words for sentiment analysis. The lexicon contains syntactic, semantic, and pragmatic information about its words. Among the information provided for each word, the items of interest here are the "positive" and "negative" tags. The lexicon includes 11,790 words. The score of each word in this lexicon is 1, 0, or -1, meaning that the word is positive, neutral, or negative, respectively.
3.1.2. Opinion Lexicon. This lexicon, which is an outcome of Bing Liu's research in sentiment analysis [Hu and Liu 2004; Liu et al. 2005], consists of 6,786 words, of which 2,009 are positive and the rest are negative. The corpus from which the words were extracted consists of customers' opinions about various features of products. The words were extracted by finding sentences that mention a frequent feature of a product and pulling the adjectives from those sentences. The extracted words were then separated into a positive and a negative cluster based on the scores of their synonyms and antonyms in a dictionary of words. The score of each word is defined in a similar way to the scoring of the Harvard General Inquirer lexicon.
3.1.3. Multi-perspective Question Answering (MPQA). The "MPQA" lexicon [Wilson et al. 2005b] consists of 8,222 words, each of which is annotated with a set of information including how subjective the word is; how strong its subjectivity is; the prior polarity of the word, which can be positive, negative, both, or neutral; and whether the word is stemmed.
This lexicon is built on top of a subjectivity lexicon that resulted from earlier work by the same team [Wilson et al. 2005a]. In the first step, annotators are given a set of instructions and an annotation scheme for marking phrases and words as positive, negative, both, or neutral. In the second step, the agreement between the annotations of two annotators is measured to evaluate the lexicon. Under this annotation scheme, the annotators' decisions depend mostly on the emotion of the sentences in the corpus. This can make the MPQA lexicon beneficial for both emotion classification and polarity classification.
3.1.4. WPARD. Using an online form, Medler et al. [2005] collected information from 342 undergraduate students. Participants were asked to rate how negative or positive the emotions they associate with each word were, using a scale from -6 (very negative) to +6 (very positive). From these data they built the Wisconsin Perceptual Attribute Rating Database (WPARD) lexicon, in which each word has a corresponding polarity and a real number showing the strength of that polarity.
3.1.5. SentiWordNet 3.0. "SentiWordNet 3.0" is a lexical resource provided by Baccianella et al. [2010]. This lexicon is built on top of its previous version, SentiWordNet 1.0. It contains 155,287 words and provides each word with a signed decimal polarity degree. In a nutshell, their method to create SentiWordNet 3.0 consists of five steps: (1) starting from a seed set of positive and negative words and expanding it through synonyms and antonyms; (2) adding objective words as a new cluster; (3) training a committee of ternary classifiers on the glosses of the words; (4) classifying the clusters of words with these classifiers; and (5) running a random walk on the graph of words until their scores converge to a final state. Recently, efforts have been made to adapt SentiWordNet to other languages. For example, Das and Bandyopadhyay [2010] develop SentiWordNet for three Indian languages (Bengali, Hindi, and Telugu), and Vu and Park [2014] construct a Vietnamese version of SentiWordNet.
3.1.6. NRC. Since 2009, S. M. Mohammad has compiled several word-sentiment lexicons from different corpora, including Twitter and customer reviews from Yelp and Amazon. In some cases, labeling is done manually, and in other cases it is done automatically, for example by using the hashtag of a tweet as its label. For more details on each of them, one can refer to Svetlana Kiritchenko and Mohammad [2014], Mohammad et al. [2013], Zhu et al. [2014], and Kiritchenko et al. [2014].

3.2. Datasets

Compared to other areas of text categorization, including emotion classification, polarity classification benefits from a larger number of well-annotated datasets. Because of this, here we only point to the benchmark datasets that have been very commonly exploited in the literature for experiments.
3.2.1. Amazon. "Amazon" is the result of the work of Blitzer et al. [2007]. It is a dataset of product reviews constructed from the Amazon website. It covers four domains (DVDs, books, electronics, and kitchen items), each of which has 2,000 reviews. The reviews of each domain are half positive and half negative, making this dataset balanced. Each instance in this dataset includes detailed information about a review: its rating, which is from 0 to 5 stars; the review title and date; and the review content. The authors crawled the data from the Amazon website, annotated reviews with ratings higher than 3 stars as positive and those with ratings lower than 3 stars as negative, and discarded the ones with 3 stars, since those reviews are more likely to have ambiguous sentiments. In addition to the labeled data, this dataset includes 3,685 unlabeled instances in the DVD domain and 5,945 unlabeled instances of kitchen reviews. This part of the dataset was created by selecting an equal number of positive and negative reviews from a set of labeled data and discarding the labels.
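The labeling rule described for this dataset is simple enough to state as code. The sketch below only restates the thresholds given above; the function names and the `(text, stars)` input format are assumptions made for illustration.

```python
def label_by_stars(stars):
    """Label a review from its star rating; 3-star reviews are treated as ambiguous."""
    if stars > 3:
        return "positive"
    if stars < 3:
        return "negative"
    return None  # exactly 3 stars: discarded


def build_labeled_set(reviews):
    """Keep only reviews with an unambiguous label.

    reviews: iterable of (text, stars) pairs (hypothetical input format).
    """
    labeled = []
    for text, stars in reviews:
        label = label_by_stars(stars)
        if label is not None:
            labeled.append((text, label))
    return labeled
```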
3.2.2. Movie Datasets. Several versions of these datasets have been extracted from the movie reviews of well-known online movie databases, all of them built by Pang et al. Here is a summary of each version:
-Pool of HTML files: These data consist of 27,886 HTML files that are unprocessed and unlabeled. The files contain reviews crawled from an online database called "Internet Movie Database" (IMDB). This is the raw version of the labeled Polarity dataset described next.
-Polarity dataset: This version of the data includes four different subversions. Subversions 0.9 and 1 [Pang et al. 2002] consist of 700 positive and 700 negative processed reviews, and subversion 1.1 is slightly modified by removing a few non-English or incomplete reviews and correcting some mislabeled reviews. Finally, subversion 2 consists of 1,000 reviews for each polarity class [Pang and Lee 2004]. Since not all the reviews in the raw version use the same rating format, they are labeled differently depending on that format. First, only those reviews whose author has explicitly declared a rating are classified. With a five-star system, reviews with 3.5 stars and up are labeled as positive, and reviews with 2.5 stars or fewer are labeled as negative. With a four-star system, reviews with 3 stars or more are labeled as positive, and those with 1.5 stars or fewer are labeled as negative. Finally, with a letter grade system, B or above is considered positive and or below is considered negative.
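These format-dependent rules can be collected in one function, sketched below. The star thresholds restate the text. The ordering of letter grades in `LETTER_GRADES` is an assumption, and because the negative letter-grade cut-off is not legible in the text, it is left as an optional parameter (`negative_letter_cutoff`) that the caller must supply; without it, letter grades never yield a negative label.

```python
# Assumed ordering of letter grades from best to worst (not given in the text).
LETTER_GRADES = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "D-", "F"]


def label_review(rating, scale, negative_letter_cutoff=None):
    """Return 'positive', 'negative', or None (unlabeled) for an explicit rating."""
    if scale == "five_star":
        if rating >= 3.5:
            return "positive"
        if rating <= 2.5:
            return "negative"
    elif scale == "four_star":
        if rating >= 3:
            return "positive"
        if rating <= 1.5:
            return "negative"
    elif scale == "letter":
        rank = LETTER_GRADES.index(rating)  # smaller rank means a better grade
        if rank <= LETTER_GRADES.index("B"):
            return "positive"
        # The negative cut-off is missing from the text; it is applied only if supplied.
        if (negative_letter_cutoff is not None
                and rank >= LETTER_GRADES.index(negative_letter_cutoff)):
            return "negative"
    else:
        raise ValueError(f"unknown rating scale: {scale}")
    return None  # rating falls between the positive and negative thresholds
```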
-Sentence polarity dataset: This version of the data includes 5,331 positive and 5,331 negative processed sentences and snippets provided by Pang and Lee [2005]. All of the instances were downloaded from a movie review database called "rottentomatoes," which classifies reviews as either fresh (meaning positive) or rotten (meaning negative).
3.2.3. Blogs. This dataset, which is provided by Melville et al. [2009], includes two different sets of blog posts, one from technology blogs and the other from political blogs. The first set, named "lotus blogs," is a set of posts about IBM Lotus collaborative software gathered from 14 blogs, 4 of which posted mostly negative comments about the product, while the others provided positive posts. The data were obtained by downloading either the latest posts from each blogger's Rich Site Summary (RSS) feed or the archived posts of that blog. The authors then extracted text from those parts of the HTML files in which the ratio of tags to words is above a minimal threshold. All the posts were then read and labeled manually as positive, negative, neutral, or irrelevant. There are 34 positive and 111 negative instances in this set.
The second part of this dataset consists of political posts regarding two candidates in the 2008 United States presidential election, namely "Barack Obama" and "Hillary Clinton." The posts were taken from a set of 16,741 blogs, filtered based on whether they contain the words "Obama" and "Clinton," and randomly selected for manual labeling.
Table II. Summary of Polarity-Related Datasets

| Name | Author | Year | Size | Type of Data |
| --- | --- | --- | --- | --- |
| Movie | B. Pang | 2004 | 29,419 | processed labeled IMDB movie reviews (document level) |
| Movie | B. Pang | 2004 | 2,000 | raw unlabeled IMDB movie reviews (document level) |
| Movie | B. Pang | 2005 | 10,662 | processed rottentomatoes movie reviews (sentence level) |
| Amazon | J. Blitzer | 2007 | 8,000 | reviews of products |
| Blogs | P. Melville | 2009 | 252 | product review and political posts |
According to Melville et al. [2009], labeling political posts is much more difficult than labeling product reviews, because the posts are more emotional, mostly contain implicit comments and judgments about the candidate, and may use cultural references to make a point. Therefore, they labeled as positive or negative only those posts that explicitly state an opinion about one of the two candidates. Hence there are no neutral or irrelevant posts in this set. The Politic dataset includes 49 positive and 58 negative posts.


In Sections 2 and 3, we discussed methods and resources for polarity classification that can be almost equally effective for emotion classification. In any emotion-related research, the first question to be answered is what emotion really is. In this section, we introduce some theories that define emotion and suggest some sets of basic emotions. While most of the research on emotions in computer science uses the terms emotion, feeling, mood, and affect interchangeably, these terms do not have exactly the same meaning. According to Fox [2008], in affective neuroscience, the terms are defined as follows:
-Emotion: discrete and consistent responses to internal or external events that have a particular significance for the organism; emotion has short-term duration.
-Feeling: a subjective representation of emotions, private to the individual experiencing them; similarly to emotion, it has short-term duration.
-Mood: a diffuse affective state that compared to emotion is usually less intense but with longer duration.
-Affect: an encompassing term used to describe the topics of emotion, feelings, and moods together.
Even with clear definitions of these terms, it remains controversial whether certain human states should be classified as emotions. For instance, thankfulness or gratitude is considered an emotion by some theorists, while others consider actions such as greeting, thanking, and congratulating to be communicative functions.
Scientific studies on the classification of human emotions date back to the 1960s. There are two prevalent theories in this field: discrete emotion theory and dimensional model. Discrete emotion theory states that different emotions arise from separate neural systems, while the dimensional model states that a common and interconnected neurophysiological system is responsible for all affective states. This model defines emotions according to one or more dimensions where usually one of them relates to intensity of emotions.
Basic emotions refer to those that do not have any other emotion as constituent parts. In addition, they can be recognized by humans all over the world regardless of their race, culture, and language. Theorists of both sides have proposed sets of emotions that tend to be basic ones. Table III shows some of the frequently referenced models of basic emotions. Ekman, one of the earliest emotion theorists, suggested that those certain emotions that are universally recognized form the set of basic emotions. He later
基本情绪指的是那些没有任何其他情绪作为构成部分的情绪。此外,它们可以被全世界的人类所认识,而不受种族、文化和语言的限制。正反两方面的理论家都提出了倾向于基本情绪的情绪集合。表 III 列出了一些经常被引用的基本情绪模型。埃克曼是最早的情绪理论家之一,他认为那些被普遍认可的特定情绪构成了基本情绪的集合。他后来
Table III. Different Models of Basic Emotions Proposed by Theorists
表 III.理论家提出的基本情感的不同模式
Theorist Year Basic Emotions 基本情绪 Type
Ekman 1972 anger, disgust, fear, joy, sadness, surprise
Plutchik 1986 anger, anticipation, disgust, fear, joy, sadness, surprise, trust
Shaver 1987 anger, fear, joy, love, sadness, surprise
Lovheim 2011 anger, disgust, distress, fear, joy, interest, shame, surprise
Fig. 2. The illustration of four frequently used emotion models.
expanded his set of emotions by adding 12 new positive and negative emotions [Ekman 1992]. The dimensional model of Plutchik and Kellerman [1986] arranges emotions on four bipolar axes: joy vs. sadness, anger vs. fear, trust vs. disgust, and surprise vs. anticipation. That two emotions are opposites is obvious in cases like joy vs. sadness but less intuitive in others, such as anger vs. fear. Shaver et al. [1987] model emotions in a tree structure in which basic emotions are the main branches and each branch has its own subcategories. Lövheim [2012] also suggests a dimensional model; however, his model differs from Plutchik's. Lövheim believes that three hormones, serotonin, dopamine, and noradrenaline, form the three dimensions of a cube, where each basic emotion is placed on one of the corners.
Figure 2 illustrates the four models explained above together so that they can be compared. Plutchik's bipolar division of emotions is marked in the figure, and the positiveness and/or negativeness of emotions is shown using the signs + and -, respectively. Emotions such as interest, surprise, and anticipation can be either positive or negative, depending on the situation in which they are felt. Alm and Sproat [2005] even divide surprise into two separate emotions, positive surprise and negative surprise. Table IV
Table IV. Commonality of Emotion Models
Emotion Ekman Plutchik Shaver and Parrott Lovheim
shows another illustration of the commonality of these emotion models. According to both Figure 2 and Table IV, anger, fear, joy, and surprise are common to all models, but there is no agreement on the rest. One interesting point is that, in all models, negative emotions outnumber positive ones. While psychologists do not agree on which model describes the set of basic emotions most accurately, the model suggested by Ekman et al. [1972], with six emotions, is the most widely used in computer science research.


In this section, we explain the major works on textual emotion mining in computer science; note, however, that emotion has been a topic of interest in many other fields as well. Murphy et al. [2015] study the use of language to convey emotional experience, and Pennebaker [1997] investigates the effect of expressing emotions on physical and mental health. Recently, interdisciplinary studies spanning psychology, linguistics, computer science, and other areas have increased. For instance, Russell et al. [2013] is a joint work by a group of anthropologists, linguists, and psychologists that discusses how people conceptualise emotions.
Walther [1992] introduced the Social Information Processing (SIP) theory, which states that, to convey relational information in computer-mediated communication, people use verbal clues in place of the nonverbal clues they would have used face to face. Walther et al. [2005] later validated this hypothesis in an experimental study, showing that affinity is expressed equally effectively in face-to-face and textual settings. In addition, verbal clues carried a larger portion of relational information in computer-mediated communication. This theory supports the validity of textual emotion mining as a research topic.
Since most of the body of research in emotion mining is dedicated to emotion classification, we put more emphasis on this division too; however, note that other directions of this field, introduced in the taxonomy, are also being investigated.
Automatic classification of emotions can be categorized from different aspects, similarly to the categorization of polarity tasks in Section 2. For example, it can be done at the document level vs. the sentence level, or it can use annotated vs. unannotated data. Although we dedicated separate sections to annotated and unannotated data for polarity tasks, most emotion research focuses on the sentence level and annotated data, so we consolidate the work here.
In a text environment, emotion analysis can be either from the writer's or from the reader's perspective. The former refers to emotions that the author had when he/she was writing the message, while the latter refers to a user's affective response to being exposed to feelings evoked by an emotional text. Readers can further be divided into two groups: an individual reader or a group or society of readers, sometimes referred to as social emotion detection. Both writer and reader can feel the same emotion in some cases; however, it is not a general rule. A reader's point of view has attracted less attention in the literature; nevertheless, it has many applications, including helping authors to predict how their work will influence the audience or helping readers to retrieve documents that have content relevant to their desired emotion [Rao et al. 2014]. Examples of social emotion detection can be found in Mishne and De Rijke [2006] and Lei et al. [2014].
In some configurations, each sample (a document or a sentence) is assumed to have one single emotion, while in others the text can be multi-emotional, which means it can contain several emotions at the same time. An example of this situation is the short document "I was happy that it was my birthday yesterday. I was anticipating my family to throw me a party. However, nobody remembering it made me sad," which shows joy, anticipation, and sadness simultaneously.
Techniques used for polarity classification of both annotated and unannotated data, discussed in Section 2, are all prevalent methods in emotion classification as well. Therefore, we do not replicate them here and instead give a thorough review of existing methods specific to emotion classification with enough elaboration.
Hancock et al. [2007] is one of the works that characterize how users express emotions in text-based systems. Their study of 40 undergraduate male and female students showed that both genders agree more with their conversation partner when they want to convey a positive attitude. They also use five times fewer negative affect terms and more punctuation marks. On the other hand, the partners who receive the emotional texts judge mostly based on negations and exclamation points. These findings are in line with what SIP theory suggests. This study contributes to automatic extraction of emotions from text by providing an insight into the strategies that people employ to convey their emotions.
Kao et al. [2009] is one of the earliest surveys on textual emotion mining. It classifies works into lexical-based (or keyword-based), learning-based, and hybrid methods where hybrids combine detecting keywords, learning patterns, and using other supplementary information. They then suggest a system in which keywords are extracted using a semantic analyzer, and an ontology is designed with the emotion theory of appraisal. These two are combined in a case-based reasoning architecture.
Jain and Kulkarni [2014] give a short survey on emotion-mining research but their review lacks a rational categorization of works. They introduce some Information Retrieval (IR) models that can be used in text research and suggest a system, called "TexEmo," that essentially uses a bag of words with Term Frequency-Inverse Document Frequency (TF-IDF) weighting as features and trains an SVM classifier on them. They do not report any results for this system.
Kim et al. [2010] follow lexical-based approaches to evaluate the merit of the "discrete emotion theory" and the "dimensional model," discussed in Section 4. To build a classifier based on the theory of discrete emotions, they use the Wordnet Affect lexicon as well as three dimensionality-reduction techniques, namely Latent Semantic Analysis (LSA), Probabilistic LSA, and Non-negative Matrix Factorization. To build a dimensional classifier, they use a normative database of English affective words, called "Affective Norms for English Words," in which each word is rated on the three dimensions of valence, arousal, and dominance. According to their results on the Semantic Evaluation (SemEval) 2007, International Survey on Emotion Antecedents and Reactions (ISEAR), and fairy tales datasets, all of which will be introduced in Section 6.2, the performance of each method varies by emotion, and no method outperforms the others on all emotions under discussion.
Alm et al. [2005] try to identify emotional passages and determine their valence (positive vs. negative). They extract 30 features from their dataset of children's fairy tales, including direct speech (whether the sentence is a whole quote), punctuation marks, fully uppercase words, sentence length, range of story progress, and POS. Then, a linear classifier, called "Sparse Network of Winnows," is applied to the data. Although their classification results are unsuccessful, their dataset is well regarded and widely used in the field of emotion mining.
Neviarouskaya et al. [2007] construct a rule-based system for emotion recognition, named "Affect Analysis Model" (AAM). They create an affect database that contains emoticons, acronyms, abbreviations, affect words, interjections, and modifiers. Each entry is manually labeled with an emotion and an intensity showing the degree of its affective state. This database is then used in a five-stage system: symbolic cue analysis, syntactical structure analysis, word-level analysis, phrase-level analysis, and, finally, sentence-level analysis. Each stage consists of a set of rules that help identify the emotion conveyed in the text. An example rule is as follows: "In a compound sentence whose independent clauses are connected with a comma, 'and', or 'so', the output emotion is equal to the emotion of the clause with maximum intensity." In a later work, Neviarouskaya et al. [2009] added the ability to process sentences of different complexity. To do so, they decompose a sentence into pieces that correspond to lexical units and then apply some extra rules to infer the total emotion of the text based on the emotions of its parts. AAM is claimed to handle informal messages and is tested on a dataset of diary-like blog posts; however, this claim remains to be demonstrated on other data. In addition, it cannot distinguish among different meanings of a word according to context and does not take into account expression modifiers such as "to death" in the example "I love my ipad to death."
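The compound-sentence rule quoted above can be sketched in a few lines. This is only an illustration, assuming that an earlier stage has already produced one (emotion, intensity) pair per independent clause; the type alias `ClauseEmotion` and the function `compound_sentence_emotion` are hypothetical names, not part of AAM.

```python
from typing import List, Tuple

# Hypothetical clause-level output: (emotion label, intensity in [0, 1]).
ClauseEmotion = Tuple[str, float]


def compound_sentence_emotion(clauses: List[ClauseEmotion]) -> ClauseEmotion:
    """Return the emotion of the clause with maximum intensity."""
    if not clauses:
        raise ValueError("at least one clause is required")
    # max() keeps the first clause when intensities tie (an assumption).
    return max(clauses, key=lambda clause: clause[1])


# Two independent clauses joined by "and", already scored by earlier stages.
clauses = [("joy", 0.4), ("sadness", 0.7)]
print(compound_sentence_emotion(clauses))  # ('sadness', 0.7)
```

A full rule-based system would chain many such rules across the five stages; the sketch covers a single sentence-level rule.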
Chaumartin [2007] proposes another rule-based system, called "University Paris 7 (UPAR7)," specifically for the SemEval 2007 dataset. The system uses the Stanford syntactic parser to build the dependency graph of each news headline. It then enriches the Wordnet Affect and SentiWordnet lexicons, uses them to rate each word separately, and finally rates the main subject of the whole headline, taking into account contrasts, accentuations, negations, modals, and so on. UPAR7 ranked among the top systems in the SemEval 2007 shared task on affective computing.
Strapparava and Mihalcea [2008] predict emotions of news headlines in an unsupervised manner from the SemEval 2007 dataset. In one experiment, they use the LSA technique as a semantic similarity mechanism. Each document can be represented in an LSA space by summing up the normalized LSA vectors of all the terms contained in it. In another experiment, they train a Naïve Bayes classifier on a collection of LiveJournal blogs as a training set and use this classifier to label their news data. Their results are acceptable compared to three other algorithms that participated in the SemEval 2007 workshop.
Danisman and Alpkocak [2008] use a Vector Space Model (VSM) classifier in which each document is represented as a vector and each axis corresponds to a unigram word. The value of a word in a vector (a document) is calculated using TF-IDF. VSM relies on two simplifying assumptions: documents with the same emotion form a contiguous region, and the region of one emotion does not overlap with the regions of the others. At classification time, the test document is converted to a vector, and the cosine of the angle between this vector and every other vector in the model determines the similarity. They show that VSM outperforms SVM and Naïve Bayes classifiers on the SemEval 2007 dataset.
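A minimal sketch of such a vector space classifier follows, assuming whitespace-tokenized text, a smoothed IDF, and a nearest-vector decision rule; the toy training sentences and all function names are illustrative assumptions and do not reproduce the authors' exact setup.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def fit_tf_idf(docs: List[List[str]]) -> Tuple[List[Vector], Dict[str, float]]:
    """Return one TF-IDF vector per tokenized document plus the IDF table."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; the +1 is an assumption that keeps common terms non-zero.
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(tokens: List[str], vectors: List[Vector],
             labels: List[str], idf: Dict[str, float]) -> str:
    """Label a document with the emotion of its most similar training vector."""
    # Terms unseen during training have no IDF and are dropped.
    query = {t: tf * idf[t] for t, tf in Counter(tokens).items() if t in idf}
    best = max(range(len(vectors)), key=lambda i: cosine(query, vectors[i]))
    return labels[best]


# Toy training data (hypothetical sentences, one emotion each).
train_docs = [
    "i am so happy today".split(),
    "this is scary and frightening".split(),
    "i feel sad and lonely".split(),
]
train_labels = ["joy", "fear", "sadness"]
vectors, idf = fit_tf_idf(train_docs)
print(classify("so happy and cheerful".split(), vectors, train_labels, idf))  # joy
```

Comparing against one averaged vector per emotion instead of every training vector is a common variant that reduces the number of cosine computations.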
Gupta et al. [2013] use an algorithm from the boosting family, namely "Boostexter," that was initially proposed in Schapire [1999]. Each base classifier in Boostexter assigns a confidence value in addition to its prediction for each instance. For a test instance, the final classifier outputs the sum of all confidences of all classifiers per class.
They also show the effectiveness of using a set of "salient features" that are essentially some linguistic clues from a dataset of customers' emails to the customer service department of some companies. These salient features include negative emotions, negative opinions, and other expressions specific to the domain of customer care such as threats to take their business elsewhere, and so on. According to their results, adding salient features to traditional n-gram features improves the performance significantly.
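The confidence-summing step of Boostexter described for Gupta et al. [2013] can be sketched as follows. The sketch assumes that each base classifier has already emitted a signed confidence per class for the test instance; training of the base classifiers is omitted, and the names `combine_confidences` and `predict` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def combine_confidences(base_outputs: List[Dict[str, float]]) -> Dict[str, float]:
    """Sum the per-class confidences emitted by all base classifiers."""
    totals: Dict[str, float] = defaultdict(float)
    for output in base_outputs:
        for label, confidence in output.items():
            totals[label] += confidence
    return dict(totals)


def predict(base_outputs: List[Dict[str, float]]) -> str:
    """Pick the class with the largest summed confidence."""
    totals = combine_confidences(base_outputs)
    return max(totals, key=totals.get)


# Hypothetical outputs of three base classifiers for one customer email.
outputs = [
    {"emotional": 0.8, "factual": -0.8},
    {"emotional": -0.2, "factual": 0.2},
    {"emotional": 0.5, "factual": -0.5},
]
print(predict(outputs))  # emotional
```

Because confidences are summed rather than counted, one confident base classifier can outweigh several hesitant ones.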
Following a psychologically based approach, Ho and Cao [2012] use a high-order Hidden Markov Model (HMM) to address the emotion classification problem on the ISEAR dataset. They believe that emotion is the result of a sequence of mental states. Their idea is to transform the input text into a sequence of events that cause mental states and then automatically generate an HMM to model the process where this sequence of events causes the emotion. They get modest results over the four emotions of anger, fear, joy, and sadness, where anger includes both anger and disgust.
As stated in Section 4, "mood" is a less intense state than emotion but has longer-term effects. Mood classification is thus very similar to emotion classification and is partially addressed in the literature. Mishne [2005] attempts to classify blog posts into one of 40 moods, including excited, sleepy, confused, crazy, and so on. The author focuses mostly on feature selection, investigating the effectiveness of length-related and semantic-oriented features, frequencies of Part Of Speech (POS) tags, Pointwise Mutual Information (PMI) for each word and mood, and emphasized words. The author attributes the lack of good results to the subjective nature of the mood categories and annotations in the corpus.
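As an illustration of the PMI feature mentioned above, the following sketch estimates PMI between a word and a mood from post-level co-occurrence counts. The estimator, the toy posts, and the function name `pmi` are assumptions made for illustration; the exact formulation used by Mishne [2005] may differ.

```python
import math
from typing import List, Set, Tuple

# Each post is a (set of tokens, mood label) pair.
Post = Tuple[Set[str], str]


def pmi(word: str, mood: str, posts: List[Post]) -> float:
    """PMI(word, mood) = log2( P(word, mood) / (P(word) * P(mood)) )."""
    n = len(posts)
    n_word = sum(1 for tokens, _ in posts if word in tokens)
    n_mood = sum(1 for _, label in posts if label == mood)
    n_both = sum(1 for tokens, label in posts if label == mood and word in tokens)
    if n_both == 0:
        return float("-inf")  # the word never occurs with this mood
    return math.log2((n_both * n) / (n_word * n_mood))


# Toy corpus of four blog posts (hypothetical).
posts = [
    ({"so", "tired"}, "sleepy"),
    ({"tired", "bed"}, "sleepy"),
    ({"party", "tonight"}, "excited"),
    ({"so", "party"}, "excited"),
]
print(pmi("tired", "sleepy", posts))  # 1.0
print(pmi("so", "sleepy", posts))     # 0.0
```

A positive value means the word occurs with the mood more often than chance would predict, which is what makes it usable as a classification feature.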

5.1. Multi-Label Emotion Classification Research

In machine learning, multi-label classification algorithms are traditionally categorized into two classes: algorithm adaptation methods and problem transformation methods. The idea of the first approach is to adapt the existing single-label classification algorithm to enable it to classify multi-labeled data. In the second approach, using some transformation techniques, the multi-labeled data are transformed into another problem space, in which they have a single label and then a single-label classifier is applied on them [Bhowmick 2009]. In what follows, some of the multi-label emotion classifiers are introduced.
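One well-known problem transformation is the label powerset transformation, in which every distinct combination of labels becomes a single atomic class. The sketch below assumes string labels joined with "+" into a class name; the function name `label_powerset` and the sample texts are hypothetical.

```python
from typing import Iterable, List, Tuple


def label_powerset(samples: List[Tuple[str, Iterable[str]]]) -> List[Tuple[str, str]]:
    """Map each multi-label sample to a single-label sample."""
    transformed = []
    for text, labels in samples:
        # Sorting makes {"joy", "sadness"} and {"sadness", "joy"} the same class.
        combined = "+".join(sorted(set(labels))) or "none"
        transformed.append((text, combined))
    return transformed


samples = [
    ("it was my birthday but nobody remembered", {"joy", "sadness"}),
    ("nobody remembered", {"sadness"}),
    ("the meeting is at noon", set()),
]
for text, single_label in label_powerset(samples):
    print(single_label, "<-", text)
```

After this step, any single-label classifier can be trained on the combined classes; the drawback is that the number of classes can grow exponentially with the number of labels.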
Given a set of single emotion labels, Bhowmick [2009] uses an ensemble-based approach, called "random k-label sets classifier," which basically consists of an ensemble of "Label Powerset" (LP) classifiers. Each LP learns a single classifier over the possible label combinations of a small random subset of all emotions, with a different subset drawn for each LP. A test instance is classified by combining the votes of the individual LP classifiers such that it is labeled with an emotion if the average vote of all classifiers is greater than a user-specified threshold. This work is an example of algorithm adaptation methods. Additionally, they explore the effectiveness of different feature sets such as the polarity of subject, object, and verbs in sentences and semantic frame features using the Berkeley FrameNet lexicon [Baker et al. 1998]. Results of their experiments on a dataset of Indian news headlines reveal that the combination of polarity and semantic features is the best choice for a multi-label environment.
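The voting step of such an ensemble can be sketched as follows. The sketch assumes that each LP classifier reports the label subset it was trained on together with the labels it predicts for the test instance, and that the average vote for a label is taken over the classifiers whose subset contains that label; training is omitted and the name `ensemble_vote` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

# (labels the LP classifier knows about, labels it predicts for the instance)
LPPrediction = Tuple[Set[str], Set[str]]


def ensemble_vote(predictions: List[LPPrediction], threshold: float = 0.5) -> Set[str]:
    """Assign a label when its average vote exceeds the threshold."""
    votes: Dict[str, int] = {}
    seen: Dict[str, int] = {}
    for label_subset, predicted in predictions:
        for label in label_subset:
            seen[label] = seen.get(label, 0) + 1
            votes[label] = votes.get(label, 0) + (1 if label in predicted else 0)
    return {label for label in seen if votes[label] / seen[label] > threshold}


# Three LP classifiers, each trained on a random pair of emotions.
predictions = [
    ({"joy", "fear"}, {"joy"}),
    ({"joy", "sadness"}, {"joy", "sadness"}),
    ({"fear", "sadness"}, set()),
]
print(ensemble_vote(predictions))  # {'joy'}
```

In this toy run, sadness receives one vote out of two, which does not exceed the 0.5 threshold, so only joy is assigned.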
Luyckx et al. [2012] is another work on multi-label classification of emotional texts. They focus on a dataset of notes written by people who have committed suicide, provided for track 2 of the medical Natural Language Processing (NLP) shared task. The task is to predict the label(s) of a note among 15 possible emotions, such as hopelessness, love, pride, thankfulness, and so on. We think that it is dubious to consider some of these labels, such as instructions and information, as emotions. First, they manually split all multi-labeled notes into single-labeled fragments. Then an SVM
Table V. Summary of Current Emotion-Mining Methods
Name | Dataset | Emotions | Multi-label | Method
C. Alm et al. [2005] | fairy tales | categorizing anger, disgust, fear, joy, sadness, positive surprise, and negative surprise into positive, negative, and … | | Sparse Network of Winnows
G. Mishne [2005] | LiveJournal | 40 moods | No | Support Vector Machine
A. Neviarouskaya et al. [2007] | 160 sentences from online blog posts | anger, disgust, fear, guilt, interest, joy, sadness, shame, surprise | No | Rule Based
F. R. Chaumartin [2007] | SemEval 2007 | anger, disgust, fear, joy, sadness, surprise | No | Rule Based
C. Strapparava and R. Mihalcea [2008] | SemEval 2007 | anger, disgust, fear, joy, sadness, surprise | | (1) unsupervised: knowledge based, (2) supervised: Naive Bayes
T. Danisman and A. Alpkocak [2008] | SemEval 2007 | anger, disgust, fear, joy, … | No | Vector Space Model
A. Neviarouskaya et al. [2009] | diary-like blog posts | anger, disgust, fear, guilt, interest, joy, sadness, shame, surprise | No | Rule Based
P. K. Bhowmick [2009] | Indian news headlines | disgust, fear, happiness, … | Yes | ensemble of Label Powerset classifiers
S. Kim et al. [2010] | SemEval 2007, ISEAR, and fairy tales | anger, fear, joy, sadness | | unsupervised: lexical based
D. T. Ho and T. H. Cao [2012] | ISEAR | anger (including disgust), fear, joy, and sadness | No | Hidden Markov Model
K. Luyckx et al. [2012] | 600 suicide notes for track 2 of the 2011 medical NLP shared task | instructions, hopelessness, love, information, guilt, blame, thankfulness, anger, sorrow, hopefulness, fear, happiness_peacefulness, pride, abuse, … | Yes | Support Vector Machine
N. Gupta et al. [2013] | set of 1,077 … | factual, emotional | No | Boosting
M. C. Jain and V. Y. Kulkarni [2014] | | anger, disgust, fear, joy, sadness, surprise | | Support Vector Machine
with a Radial Basis Function kernel is trained on these single-labeled data. Finally, a threshold is set on the SVM's estimated probability for each emotion; if the probability exceeds the threshold, then that emotion is assigned to the sentence. Their method improves recall over a baseline method at the cost of lower precision.
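The thresholding step can be illustrated with a short sketch. It assumes that a trained classifier has already produced one probability estimate per emotion for a sentence; the probabilities below are invented for illustration and the name `assign_emotions` is hypothetical.

```python
from typing import Dict, List


def assign_emotions(probabilities: Dict[str, float], threshold: float) -> List[str]:
    """Return every emotion whose estimated probability exceeds the threshold."""
    return sorted(e for e, p in probabilities.items() if p > threshold)


# Hypothetical per-emotion probability estimates for one sentence.
probabilities = {"hopelessness": 0.62, "love": 0.35, "guilt": 0.18}

print(assign_emotions(probabilities, threshold=0.5))  # ['hopelessness']
print(assign_emotions(probabilities, threshold=0.3))  # ['hopelessness', 'love']
```

Lowering the threshold assigns more labels per sentence, which tends to raise recall and lower precision, consistent with the trade-off reported above.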
Table V shows a summary of the explained methods in this section, in chronological order. They are compared with respect to the dataset and set of emotions they use, as well as the main characteristics of their approach.

5.2. Emotion Mining Research on Twitter

With more than 300 million active users and 500 million tweets per day, Twitter is a popular network for sharing personal feelings and moods with acquaintances and friends. Hence, significant research is devoted to Twitter data with the purpose of analyzing the emotions expressed in tweets. Being short and informal, having misspellings, and using hashtags, special symbols such as emoticons and emojis, short forms of words, and abbreviations are properties that discriminate tweets from normal texts and add to the complexity of the task.
Bollen et al. [2011] analyze emotions of all tweets in a specific time frame. They use a psychometric test, named "Profile of Mood States" (POMS) consisting of 793 adjective terms, each related to a particular emotion. Then the probability of each tweet showing an emotion is calculated based on these features, and the results are aggregated over all the tweets of 1 day. Finally, the overall emotions of tweets are compared with global events of that period and some correlations are found. Although this method does not consider the reader's perspective, it may still be classified as a social emotion detection task, introduced earlier in this section.
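The per-tweet scoring and daily aggregation can be sketched roughly as follows. The five-word lexicon, the scoring rule (share of matched terms per mood), and the function names are all assumptions for illustration; the actual POMS-based procedure is more elaborate.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical mini-lexicon mapping adjectives to mood dimensions.
LEXICON = {
    "tense": "tension", "nervous": "tension",
    "angry": "anger", "furious": "anger",
    "tired": "fatigue",
}


def tweet_scores(text: str) -> Dict[str, float]:
    """Share of matched lexicon terms per mood dimension for one tweet."""
    counts: Dict[str, int] = defaultdict(int)
    for token in text.lower().split():
        mood = LEXICON.get(token)
        if mood is not None:
            counts[mood] += 1
    total = sum(counts.values())
    return {mood: c / total for mood, c in counts.items()} if total else {}


def daily_mood(tweets: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Average the per-tweet scores over all tweets of each day."""
    sums: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    n_tweets: Dict[str, int] = defaultdict(int)
    for day, text in tweets:
        n_tweets[day] += 1
        for mood, score in tweet_scores(text).items():
            sums[day][mood] += score
    return {day: {m: s / n_tweets[day] for m, s in sums[day].items()}
            for day in n_tweets}


tweets = [
    ("day-1", "so nervous and tense"),
    ("day-1", "angry and tired"),
    ("day-2", "nothing special"),
]
print(daily_mood(tweets))
```

The resulting daily series is what would then be compared against external events of the same period.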
Hashtags are space-free phrases following the "#" character, such as #mickeymouse and #iamhappy. They can be used as indexes for searching related content or for grouping messages. Hashtags are widely used on Twitter because they convey valuable information in a short piece of text. Wang et al. [2012] build a dataset of tweets from Twitter and use hashtags as emotion labels. To validate this type of labeling, they randomly select 400 tweets and label them manually. Comparing the manual labels with the hashtag labels shows acceptable consistency. They then explore the effectiveness of different features, such as n-grams, different lexicons, POS, and adjectives, in detecting emotions. Their best result is obtained when unigrams, bigrams, lexicons, and POS are used. Finally, they show that increasing the size of the training set has a direct effect on accuracy. While their dataset is a good source of emotional tweets, it is highly imbalanced, and the use of some unclear hashtags as emotion labels, such as #embarrass for sadness, leaves the soundness of the dataset open to criticism.
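Hashtag-based labeling of this kind can be sketched as follows. The hashtag-to-emotion mapping, the rule of discarding tweets with no or conflicting emotion hashtags, and the removal of the label hashtag from the text are assumptions made for illustration and are not the exact procedure of Wang et al. [2012].

```python
import re
from typing import Optional, Tuple

# Hypothetical mapping from emotion hashtags to emotion labels.
HASHTAG_TO_EMOTION = {
    "#happy": "joy", "#iamhappy": "joy",
    "#sad": "sadness", "#scared": "fear",
}
HASHTAG_RE = re.compile(r"#\w+")


def label_from_hashtags(tweet: str) -> Optional[Tuple[str, str]]:
    """Return (text without the label hashtag, emotion), or None if unusable."""
    tags = [tag.lower() for tag in HASHTAG_RE.findall(tweet)]
    emotions = {HASHTAG_TO_EMOTION[t] for t in tags if t in HASHTAG_TO_EMOTION}
    if len(emotions) != 1:
        return None  # no emotion hashtag, or conflicting ones
    # Drop the emotion hashtags so the label does not leak into the features.
    text = HASHTAG_RE.sub(
        lambda m: "" if m.group(0).lower() in HASHTAG_TO_EMOTION else m.group(0),
        tweet,
    )
    return " ".join(text.split()), emotions.pop()


print(label_from_hashtags("Finally got my results #iamhappy #exams"))
# ('Finally got my results #exams', 'joy')
print(label_from_hashtags("Mixed day #happy #sad"))  # None
```

Labels obtained this way are noisy, which is why a manual check on a random sample, as described above, is a sensible validation step.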
Hasan et al. [2014] also validate the use of hashtags as emotion labels on a set of 134,000 tweets. To this end, they compare hashtag labels with labels assigned by a group of people (the crowd) and with labels assigned by a group of psychologists. They find that the crowd labels are not even consistent among themselves, whereas the psychologists' labels are more consistent and also agree more with the hashtags. Therefore, they cast doubt on the use of crowd labeling, such as Amazon's Mechanical Turk, for tasks related to emotion mining. They also introduce a supervised classifier, named "EmoTex." It essentially uses a feature set of unigrams, a list of negation words, emoticons, and punctuation and runs K-Nearest Neighbors (KNN) and SVM on the training data.
Roberts et al. [2012] create a corpus of 7,000 manually labeled tweets retrieved by searching for 14 emotion-evoking topics, such as the World Cup and Christmas. There are seven emotions in total, and each tweet can carry zero, one, or several of them. Seven binary SVMs, one per emotion and each with a different feature set, are trained. Features include n-grams, punctuation, hypernyms, and topics. To obtain topics, they assume that each tweet is associated with a probabilistic mixture of topics, which is inferred using LDA. Their best performance was on fear, which led them to infer that fear is highly lexicalized, with less variation than other emotions.
Mohammad [2012] introduced his corpus, called the "Twitter Emotion Corpus" (TEC), collected from Twitter, that will be explained in Section 6.2 and, similarly to Roberts et al. [2012], built binary SVMs, one for each emotion, using unigrams and bigrams as features. He then showed the effectiveness of this corpus in cross-domain classifications by using these data to predict emotions on another dataset, SemEval 2007. He also built a lexicon from this corpus that will be introduced in Section 6.1.
Table VI. Summary of Current Emotion-Mining Methods on Twitter
Name | Dataset | Emotions | Method
J. Bollen et al. [2011] | crawled about …; no labeling | tension, depression, anger, vigour, fatigue, … | Profile of Mood States
W. Wang et al. [2012] | crawled about …; using hashtags | anger, fear, joy, love, sadness, surprise, … | linear classifier
K. Roberts et al. [2012] | crawled 7,000 tweets from 14 emotion-evoking topics | anger, disgust, fear, joy, love, sadness, … | Support Vector Machine
S. M. Mohammad [2012] | built TEC by crawling about 21,000 tweets; using hashtags | anger, disgust, fear, joy, sadness, surprise | Support Vector Machine
M. Hasan et al. [2014] | crawled about 134,000 tweets; using hashtags | … model: active, inactive / happy, unhappy | Support Vector Machine and K-Nearest Neighbors
W. Li and H. Xu [2014] | 16,485 posts from Weibo, a Chinese microblog website | anger, disgust, fear, joy, sadness, surprise | Support Vector Regression
Table VI summarizes the methods discussed above that work on Twitter data, in chronological order. It compares them by the dataset and set of emotions they use, as well as by the main characteristics of their approach.

5.3. Emotion Mining for Other Languages

Most of the work in textual emotion mining is on the English language; nevertheless, it is worth mentioning the few works done on other languages, since the ideas and techniques may still be used in a language-agnostic way.
Li and Xu [2014] try to detect emotions in messages from Weibo, a Chinese microblog website whose functionality is very similar to Twitter's. They argue that the accuracy of detecting emotions in a text can be increased by looking for the events that cause the emotions; in this respect, their work is similar to Ho and Cao [2012]. They therefore adopt the notion of cause events, which are meant to be the reasons for certain emotions. To spot cause events and use them as features, they exploit a marker list, containing keywords that mark the occurrence of cause events; an emotion list, containing keywords that express emotions; and a linguistic pattern set, describing how emotions and cause events are arranged in a text. All of these resources are adapted to the informal environment of Weibo. Then a "Support Vector Regression" (SVR), an algorithm from the family of SVMs, is trained using these features. According to the results, performance is boosted for some emotions but decreases for others, such as fear and sadness. Lei et al. [2014] is another example of an emotion-mining study in Chinese; it is explained in Section 6.1. Also, the aforementioned method of Bhowmick [2009] addresses the emotion-mining task on an Indian dataset in a multi-label environment.
In addition, since most of the tools for emotion mining are built for the English language, a portion of the work is dedicated to providing resources specific to other languages, either by developing a resource from scratch or by adapting existing English resources. Examples of adapting SentiWordNet for the Indian and Vietnamese languages are presented in Section 3.1.5.
Table VII. Summary of Emotion-Related Lexicons

| Name | Author | Year | Size | Set of Emotions |
|---|---|---|---|---|
| Wordnet Affect | C. Strapparava | 2004 | 4,787 | a hierarchy of emotions |
| LIWC | J. W. Pennebaker | 2007 | 5,000 | affective or not, positive, negative, anxiety, anger, sadness |
| NRC | S. M. Mohammad | 2010 | 14,182 | anger, fear, anticipation, trust, surprise, sadness, joy, disgust |
| NRC hashtag | S. M. Mohammad | 2013 | 32,400 | anger, fear, anticipation, trust, surprise, sadness, joy, disgust |
| CBET | A. Gholipour Shahraki | 2015 | 24,000 | anger, fear, joy, love, sadness, surprise, thankfulness, disgust, guilt |

6.1. Lexicons

Almost all emotion-mining works rely on a lexicon. Lexicons are useful because they give prior information about the type and strength of emotion carried by each word or phrase. In this section, we introduce some of the lexicons useful for the emotion-mining task. Their characteristics are summarized in Table VII.
6.1.1. Wordnet Affect. Wordnet Affect is an emotional lexical resource consisting of sets of synonymous words, referred to as synsets. The set of emotions in this lexicon is hierarchically organized. Strapparava and Valitutti [2004] build this lexicon on top of Wordnet. They manually form an initial set of 1,903 affective words and expand it by adding the corresponding nouns, verbs, adjectives, adverbs, and so on. Then the subset of Wordnet synsets that contain at least one of these affective words is selected, and the rest are rejected. This forms the core of the lexicon. Next, the lexical and semantic relations between synsets of this core lexicon and other synsets of Wordnet are examined to see whether they preserve the affective meaning represented by the core synsets. After the new synsets are added, Wordnet Affect contains 2,874 synsets and 4,787 words. One interesting feature of this lexicon is the stative/causative distinction for words. A word is causative if it refers to an emotion that is caused by the entity it describes (e.g., amusing). On the other hand, a word is stative if it refers to an emotion owned or felt by the subject (e.g., amused).
6.1.2. LIWC. The Linguistic Inquiry and Word Count (LIWC) is another emotion-related lexicon, developed by Pennebaker et al. [2007]. In the first step of generating this lexicon, some initial category scales are generated in a psychological process, and then various scales are added to the initial lists in brainstorming sessions. In the next step, three independent judges rate the words in two phases; after each phase, all category scale lists are updated according to the judges' ratings. The initial LIWC judging took place in 1992, and the lexicon has since been updated and largely expanded.
6.1.3. NRC. Mohammad and Turney [2010] develop the NRC word-emotion association lexicon. Using Amazon's Mechanical Turk, they asked Turkers to annotate words, from non-specific domains, according to the emotion they evoke. One important challenge in this process is malicious annotations, which can happen in cases where different senses of a word evoke different emotions. To solve this problem, the target sense needs to be conveyed to annotators. Hence, they asked Turkers additional questions, including word choice questions, that help identify instances where the annotator may not be familiar with the target term. In addition to building a lexicon, they concluded that a regular crowd can produce reliable emotion annotation, given proper guidelines. This is in contrast with the findings of Hasan et al. [2014], who showed that crowd labels for emotional tweets have quite low agreement with each other and with the emotional hashtags of the tweets.
6.1.4. NRC Hashtag. The main author of the NRC lexicon, S. M. Mohammad, later developed another useful lexicon, called the "NRC hashtag emotion lexicon" [Mohammad 2012]. Using a corpus of 21,000 tweets (TEC), the Strength of Association (SoA) between an n-gram $n$ and an emotion $e$ is calculated to be

$$\mathrm{SoA}(n, e) = \mathrm{PMI}(n, e) - \mathrm{PMI}(n, \neg e),$$

where PMI is the pointwise mutual information, calculated as

$$\mathrm{PMI}(n, e) = \log \frac{\mathrm{freq}(n, e)}{\mathrm{freq}(n)\,\mathrm{freq}(e)},$$

where $\mathrm{freq}(n, e)$ is the number of times that $n$ occurs in a tweet that has the label $e$, and $\mathrm{freq}(n)$ and $\mathrm{freq}(e)$ are the frequencies of $n$ and $e$, respectively, in the corpus. $\mathrm{PMI}(n, \neg e)$ is calculated likewise. Words having SoA greater than zero are kept in the lexicon.
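As a minimal sketch of the Strength of Association computation, assuming SoA(n, e) = PMI(n, e) - PMI(n, not e) with PMI(n, e) = log(freq(n, e) / (freq(n) freq(e))), the following Python function builds a lexicon from labeled tweets. The function name, the input format, the base-2 logarithm, counting each n-gram at most once per tweet, and treating an n-gram that never occurs outside an emotion as having infinite SoA are all assumptions made for illustration; they are not taken from the original work.

```python
import math
from collections import Counter


def build_soa_lexicon(labeled_tweets):
    """Return {(ngram, emotion): SoA} for all pairs with SoA > 0.

    labeled_tweets: iterable of (list_of_ngrams, emotion_label) pairs.
    """
    freq_n = Counter()   # freq(n): number of tweets containing n-gram n
    freq_e = Counter()   # freq(e): number of tweets labeled e
    freq_ne = Counter()  # freq(n, e): tweets labeled e that contain n
    for ngrams, emotion in labeled_tweets:
        freq_e[emotion] += 1
        for n in set(ngrams):  # assumption: count an n-gram once per tweet
            freq_n[n] += 1
            freq_ne[(n, emotion)] += 1
    total = sum(freq_e.values())

    lexicon = {}
    for (n, e), f_ne in freq_ne.items():
        f_not_e = total - freq_e[e]   # freq(not e)
        f_n_not_e = freq_n[n] - f_ne  # freq(n, not e)
        if f_not_e == 0:
            continue  # a single-emotion corpus gives nothing to contrast
        if f_n_not_e == 0:
            # Assumption: PMI(n, not e) is minus infinity, so SoA is +inf.
            soa = math.inf
        else:
            pmi_e = math.log2(f_ne / (freq_n[n] * freq_e[e]))
            pmi_not_e = math.log2(f_n_not_e / (freq_n[n] * f_not_e))
            soa = pmi_e - pmi_not_e
        if soa > 0:  # keep only positively associated pairs
            lexicon[(n, e)] = soa
    return lexicon
```

Because freq(n) appears in both PMI terms, it cancels in the difference; the sketch keeps it only so that the code mirrors the two formulas term by term.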
6.1.5. Clean Balanced Emotional Tweets (CBET). This lexicon is compiled by Gholipour Shahraki [2015] from the single-labeled part of a dataset with the same name. This dataset contains a large number of tweets, each labeled with a single emotion (for more information about it, see Section 6.2). The lexicon is actually a $|W| \times |E|$ matrix, where $W$ is the set of all the single words (unigrams) contained in the CBET dataset and $E$ is the set of emotions covered in it. The element at index $(i, j)$ of the matrix denotes the degree to which word $w_i$ expresses emotion $e_j$. In other words, each entry of the lexicon has a corresponding weight vector that contains weights associated with each of the participating emotions. The weight is calculated as the number of times that $w_i$ has occurred in tweets that have the label $e_j$ in the dataset. That is,

$$L_{ij} = \sum_{t \in T} \mathbb{1}(w_i \in t)\, P(e_j \mid t),$$

where $T$ is the set of single-labeled tweets, $P(e_j \mid t)$ is the presence of emotion $e_j$ given tweet $t$, and $\mathbb{1}(w_i \in t)$ is an indicator function that is equal to 1 if $w_i$ occurs in $t$ and is 0 otherwise. The naïve assumption supporting this idea is that all the words in a tweet are in agreement with the label of that tweet. The CBET lexicon is the newest emotion lexicon; it is publicly available and covers more emotions than any of the previous ones.
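The count-based construction described for the CBET lexicon can be sketched as follows: every unigram gets a weight vector with one entry per emotion, and an entry is incremented whenever the word appears in a tweet carrying that label. The function name, the lowercased whitespace tokenization, and the toy data are assumptions for illustration only, not the preprocessing used for the actual lexicon.

```python
from collections import defaultdict


def build_count_lexicon(labeled_tweets, emotions):
    """Map each unigram to a list of counts ordered like `emotions`.

    labeled_tweets: iterable of (tweet_text, emotion_label) pairs,
    each tweet carrying exactly one label.
    """
    column = {e: j for j, e in enumerate(emotions)}
    lexicon = defaultdict(lambda: [0] * len(emotions))
    for text, label in labeled_tweets:
        # set(...) plays the role of the indicator function:
        # a word counts once per tweet in which it is present.
        for word in set(text.lower().split()):
            lexicon[word][column[label]] += 1
    return dict(lexicon)


emotions = ["joy", "anger"]
tweets = [
    ("so happy today", "joy"),
    ("happy happy birthday", "joy"),
    ("so angry today", "anger"),
]
lexicon = build_count_lexicon(tweets, emotions)
print(lexicon["happy"])  # weight vector for "happy" over (joy, anger)
```

Under the naïve assumption stated above, a neutral word such as "so" ends up with similar counts for every emotion, which is why raw counts are usually normalized before being used as features.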
In addition to these publicly available lexicons, there are other lexicons generated for specific tasks that are not accessible; nevertheless, reviewing their method of generation can still provide some ideas if one wants to build his/her own special-purpose lexicon.
6.1.6. Word-Emotion Mapping Lexicon. Katz et al. [2007] create a word-emotion mapping from the SemEval 2007 dataset, which is introduced in Section 6.2. A weight vector is assigned to each lemmatized word $w$ from the corpus such that each element in this vector corresponds to one emotion. The value of each element is then calculated as the average score for that emotion observed in all samples in which $w$ participated.
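A minimal sketch of this averaging scheme is given below. It assumes each sample is a list of already lemmatized tokens paired with a dictionary of emotion scores on the 0 to 100 scale; the function name, the input format, and the example headlines are hypothetical, and lemmatization itself is left out.

```python
from collections import defaultdict


def build_average_score_lexicon(samples, emotions):
    """Map each word to its average emotion scores, ordered like `emotions`.

    samples: iterable of (lemmatized_tokens, {emotion: score}) pairs.
    """
    sums = defaultdict(lambda: [0.0] * len(emotions))
    counts = defaultdict(int)
    for tokens, scores in samples:
        for word in set(tokens):  # a word participates once per sample
            counts[word] += 1
            for j, emotion in enumerate(emotions):
                sums[word][j] += scores[emotion]
    return {w: [s / counts[w] for s in sums[w]] for w in sums}


emotions = ["fear", "joy"]
samples = [
    (["storm", "hit", "coast"], {"fear": 80, "joy": 0}),
    (["storm", "pass"], {"fear": 20, "joy": 40}),
]
word_emotion = build_average_score_lexicon(samples, emotions)
print(word_emotion["storm"])  # average (fear, joy) scores for "storm"
```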
Table VIII. Summary of Emotion-Related Datasets

| Name | Author | Year | Size | Type of Data |
|---|---|---|---|---|
| ISEAR | K. R. Scherer | 1997 | 7,666 | crowd-written paragraphs |
| Fairy tales | C. Ovesdotter Alm | 2005 | 15,000 | sentences from children's stories |
| SemEval | C. Strapparava | 2007 | 1,250 | news headlines |
| TEC | S. M. Mohammad | 2012 | 21,000 | tweets |
| CBET | A. Gholipour Shahraki | 2015 | 81,163 | tweets |
6.1.7. Chinese Lexicon. Lei et al. [2014] propose a framework for generating a domain- and context-dependent emotion lexicon. First, they select a well-formed training set from a corpus of news headlines taken from the Sina website, a popular news site in China. The criterion for selecting a headline is that it be among those with the highest rating for at least one emotion. Next, the lexicon is built such that the weight for each word $w$ and each emotion $e$ combines, over the documents $d$ of the training set, the relative term frequency of $w$ in $d$, the co-occurrence number of $d$ and $e$, and the prior probability of $d$. Results of their experiments show an improvement over existing lexicon generation methods such as that of Katz et al. [2007].

6.2. Datasets

A long-standing challenge in most machine-learning work is collecting data, especially labeled data. Apart from the costs of manual labeling, in the specific problem of emotion annotation, results are often subject to misunderstandings, the subjective interpretations of annotators, their personality, the perspective from which the content is analyzed, and so on [Alm 2008]. In this section, we introduce some useful datasets that have a reliable labeling process and/or are widely used. Table VIII shows a summary of these datasets.
6.2.1. ISEAR. Scherer and Wallbott [1994] present one of the oldest emotion-labeled datasets, ISEAR, which is freely available for download. The data were collected during the 1990s by a large group of psychologists from all over the world who were working on the ISEAR project. In this survey, 3,000 students, including both psychologists and non-psychologists, from 37 countries on all five continents were asked to report situations in which they had experienced the following seven major emotions: joy, fear, anger, sadness, disgust, shame, and guilt. In their written reports, respondents were asked to explain how they had appraised the situation and how they had reacted. For non-English speakers, the text was translated into English. Hence, the format of the data is a sentence or paragraph, labeled with exactly one emotion. This dataset is reliable in terms of labeling, since the authors themselves annotated their text. However, translating from other languages into English might change the sense and the emotions. Surprisingly, ISEAR was not used for emotion-mining purposes until 2008.
6.2.2. Fairy Tales. A set of fairy tales is another dataset, developed by Alm and Sproat [2005]. It contains 185 children's stories written by Beatrix Potter, the Brothers Grimm, and Hans Christian Andersen, with a total of about 15,000 sentences, each labeled with one of the following emotions: anger, disgust, fear, happiness, sadness, positively surprised, negatively surprised, or neutral if the sentence does not show any emotion. The annotation was done manually by six female native English speakers. Note that, unlike the ISEAR dataset, in which texts are annotated on the document level, the fairy tales dataset is annotated on the sentence level.
6.2.3. SemEval 2007. Strapparava and Mihalcea [2007] developed a dataset for the SemEval 2007 workshop on the shared task of affective computing. It consists of news headlines from major news outlets such as The New York Times, CNN, and BBC News, as well as the Google News search engine. The annotation was done manually by six annotators, and the set of labels includes six emotions: anger, disgust, fear, joy, sadness, and surprise. Instead of the usual 0/1 binary annotation, they run a finer-grained labeling process. An interval $[0, 100]$ is set for each emotion, and the annotator decides to what degree, from 0 to 100, the headline shows that emotion. Hence, a headline can have multiple emotions, each with a different degree. To justify why news was selected to build this dataset, they claim that news typically has a high load of emotional content and is written in a style meant to attract readers' attention. In fact, there is a popular concept in the news world, called "Emotional Framing" [Corcoran 2006], positing that each news item is shaped into a story with layers of dramatic frames, such as fear caused by danger or alarming news. Although this idea backs up the development of the SemEval 2007 dataset, our statistical analyses show that the data are most likely to be neutral and that news expresses little tangible emotion. For example, the average degree over all emotions for a headline is only 15.48 (of 100). Also, only a small share of headlines express anger, disgust, fear, joy, sadness, or surprise with a degree of more than 50.
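The two statistics mentioned above (the average degree over all emotions, and the share of headlines whose degree for an emotion exceeds 50) can be computed along the following lines. This is only a sketch: the function name, the input format, and the four toy headlines are hypothetical and do not reproduce the actual SemEval 2007 annotations.

```python
def summarize_degrees(headlines, emotions, threshold=50):
    """Return (mean degree over all emotions, share above threshold per emotion).

    headlines: list of {emotion: degree} dicts with degrees in [0, 100].
    """
    n = len(headlines)
    mean_degree = sum(h[e] for h in headlines for e in emotions) / (n * len(emotions))
    share_above = {
        e: sum(1 for h in headlines if h[e] > threshold) / n for e in emotions
    }
    return mean_degree, share_above


emotions = ["fear", "joy"]
toy_headlines = [
    {"fear": 60, "joy": 0},
    {"fear": 10, "joy": 10},
    {"fear": 0, "joy": 40},
    {"fear": 0, "joy": 0},
]
mean_degree, share_above = summarize_degrees(toy_headlines, emotions)
print(mean_degree, share_above)
```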
6.2.4. TEC. Mohammad [2012] created a corpus of emotional tweets from Twitter (TEC). He targeted the following six basic emotions proposed by Ekman et al. [1972]: anger, disgust, fear, joy, sadness, and surprise, and chose six hashtags addressing these emotions (e.g., #anger, #disgust, etc.) to search for appropriate tweets using the Twitter Search Application Program Interface (API). He discarded very short tweets, very badly spelled ones, and those with the prefix "RT," which are retweets of another tweet. He also removed the tweets that did not have the emotional hashtag at the end of the message, since he believed such hashtags may not be good indicators of the label of the tweet. After this post-processing, TEC includes 21,051 tweets, unevenly distributed over the six labels (anger, for instance, accounts for only 7.4% of the corpus). This shows how imbalanced the dataset is.
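The filtering rules described for TEC can be sketched as a small labeling function. The minimum length of three tokens, the function name, and the whitespace tokenization are assumptions chosen for illustration; the spelling-quality filter is omitted because the text does not say how it was applied.

```python
EMOTION_TAGS = {"#anger", "#disgust", "#fear", "#joy", "#sadness", "#surprise"}


def label_from_hashtag(tweet, min_tokens=3):
    """Return the emotion label of a tweet, or None if the tweet is discarded."""
    tokens = tweet.split()
    if len(tokens) < min_tokens:               # very short tweet (assumed cutoff)
        return None
    if tokens[0].rstrip(":").upper() == "RT":  # retweet of another tweet
        return None
    last = tokens[-1].lower()
    if last not in EMOTION_TAGS:               # hashtag must end the message
        return None
    return last[1:]                            # drop the leading '#'


print(label_from_hashtag("Missed the bus again #anger"))
```

Requiring the hashtag in final position is what turns it into a usable label: a tag in the middle of a sentence is more likely to be part of the content than a comment on it.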
6.2.5. CBET. Gholipour Shahraki [2015] compiled the Cleaned Balanced Emotional Tweets (CBET) dataset from Twitter, using hashtags to search for tweets that have at least one of these nine emotions: anger, fear, joy, love, sadness, surprise, thankfulness, disgust, and guilt. The corpus is also preprocessed and cleaned. One interesting step in cleaning the tweets is segmenting the space-free phrases used as hashtags. For instance, the hashtag "#animalrights" is segmented into "animal" and "rights," while the original form of the hashtag is preserved as well. CBET has two parts. The larger part contains tweets that have exactly one label, referred to as single-labeled samples. This part is perfectly balanced over labels, containing 76,860 tweets, with 8,540 for each emotion. The smaller part contains double-labeled tweets, that is, those that express two emotions simultaneously. The size of this portion is 4,303, and it is imbalanced, as not all combinations of emotions occur together equally frequently. The most frequent paired label is joy-love, while some pairs, such as anger-thankfulness, are very rare. With a total of 81,163 tweets, this dataset is the largest available corpus for emotion-mining research.
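The hashtag segmentation step can be illustrated with a dictionary-based dynamic program that splits a space-free hashtag into the fewest known words. The text does not say which segmentation method was used for CBET, so this approach, the function name, and the tiny vocabulary are assumptions made purely for illustration.

```python
def segment_hashtag(hashtag, vocabulary):
    """Return the original hashtag followed by its word segments, if any.

    Uses dynamic programming over prefixes and prefers the split with the
    fewest words; if no full split exists, only the hashtag is returned.
    """
    body = hashtag.lstrip("#").lower()
    # best[i] is the shortest list of vocabulary words covering body[:i]
    best = [None] * (len(body) + 1)
    best[0] = []
    for end in range(1, len(body) + 1):
        for start in range(end):
            piece = body[start:end]
            if best[start] is not None and piece in vocabulary:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    words = best[len(body)]
    return [hashtag] + (words or [])


vocabulary = {"a", "an", "animal", "right", "rights"}
print(segment_hashtag("#animalrights", vocabulary))
```

Keeping the original hashtag in the output mirrors the description above, where the segmented words are added while the original form of the hashtag is preserved.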


In this survey, we introduced state-of-the-art methods and improvements on text sentiment analysis. Sentiment analysis refers to all the areas of detecting, analyzing, and evaluating humans' state of mind towards different topics of interest. In particular, text sentiment analysis aims to mine people's opinions, sentiments, and emotions based on their writings. Personal notes, emails, news headlines, blogs, tales, novels, chat messages, and social networking websites such as Twitter, Facebook, and MySpace are some types of text that can convey emotions.
In this work, we suggested a careful categorization of tasks in this area and provided a clear and logical taxonomy of sentiment analysis work. There are two main subcategories in this field: opinion mining and emotion mining. The former deals with the expression of opinions, and the latter is concerned with the articulation of emotions. There is a rich body of research on opinion mining, and many new focused and specialized areas are being investigated, while emotion mining from text is still in its infancy. Considering this fact and the strong link between the two, we tried to give a comprehensive overview of the most recent trends and useful resources in opinion mining and emotion mining. Towards this goal, we first explained the key elements of the polarity classification task and reviewed the works in this area that can be useful for the emotion-mining task. Second, we introduced a set of important resources, including lexicons and datasets, that researchers need for a polarity classification task. Third, we reviewed emotion theories as an introduction to the world of human emotions. A thorough survey of emotion-related research was given next, and useful resources specific to emotion-mining work were introduced.


Cecilia Ovesdotter Alm, Dan Roth, and Richard Sproat. 2005. Emotions from text: Machine learning for text-based emotion prediction. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 579-586.
Cecilia Ovesdotter Alm and Richard Sproat. 2005. Emotional sequencing and development in fairy tales. In Affective Computing and Intelligent Interaction. Springer, 668-674.
Ebba Cecilia Ovesdotter Alm. 2008. Affect in Text and Speech. ProQuest.
Alina Andreevskaia and Sabine Bergler. 2006. Mining wordnet for a fuzzy sentiment: Sentiment tag extraction from wordnet glosses. In , Vol. 6. 209-215.
Anthony Aue and Michael Gamon. 2005. Customizing sentiment classifiers to new domains: A case study. In Proceedings of Recent Advances in Natural Language Processing (RANLP), Vol. 1. Citeseer.
Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani. 2010. SentiWordNet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining. In LREC, Vol. 10. 2200-2204.
Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The berkeley framenet project. In Proceedings of the 17th International Conference on Computational Linguistics-Volume 1. Association for Computational Linguistics, 86-90.
Farah Benamara, Carmine Cesarano, Antonio Picariello, Diego Reforgiato Recupero, and Venkatramana S. Subrahmanian. 2007. Sentiment analysis: Adjectives and adverbs are better than adjectives alone. In ICWSM.
Plaban Kumar Bhowmick. 2009. Reader perspective emotion analysis in text through ensemble based multilabel classification framework. Comput. Inf. Sci. 2, 4 (2009), 64-74.
David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent dirichlet allocation. J. Mach. Learn. Res. 3 (2003), 993-1022.
John Blitzer, Mark Dredze, and Fernando Pereira. 2007. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In , Vol. 7. Citeseer, 440-447.
Johan Bollen, Huina Mao, and Alberto Pepe. 2011. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. In ICWSM.
Carlos Busso, Zhigang Deng, Serdar Yildirim, Murtaza Bulut, Chul Min Lee, Abe Kazemzadeh, Sungbok Lee, Ulrich Neumann, and Shrikanth Narayanan. 2004. Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th International Conference on Multimodal Interfaces. ACM, 205-211.
François-Régis Chaumartin. 2007. UPAR7: A knowledge-based system for headline sentiment tagging. In Proceedings of the 4th International Workshop on Semantic Evaluations. Association for Computational Linguistics, 422-425.
P. E. Corcoran. 2006. Emotional framing in australian journalism. In Australian & New Zealand Communication Association International Conference, Adelaide, Australia (ANZCA).
Taner Danisman and Adil Alpkocak. 2008. Feeler: Emotion classification of text using vector space model. In AISB 2008 Convention Communication, Interaction and Social Intelligence, Vol. 1. 53.
Amitava Das and Sivaji Bandyopadhyay. 2010. SentiWordNet for indian languages. Asian Federation for Natural Language Processing, China (2010), 56-63.
Sanjiv Das and Mike Chen. 2001. Yahoo! for amazon: Extracting market sentiment from stock message boards. In Proceedings of the Asia Pacific Finance Association Annual Conference (APFA), Vol. 35. Bangkok, Thailand, 43.
Kushal Dave, Steve Lawrence, and David M. Pennock. 2003. Mining the peanut gallery: Opinion extraction and semantic classification of product reviews. In Proceedings of the 12th International Conference on World Wide Web. ACM, 519-528.
Munmun De Choudhury, Michael Gamon, Scott Counts, and Eric Horvitz. 2013. Predicting depression via social media. In ICWSM.
Kerstin Denecke and Yihan Deng. 2015. Sentiment analysis in medical settings: New opportunities and challenges. Artif. Intell. Med. 64, 1 (2015), 17-27.
Sidney K. D'mello and Jacqueline Kory. 2015. A review and meta-analysis of multimodal affect detection systems. ACM Comput. Surv. 47, 3 (2015), 43.
Cícero Nogueira dos Santos and Maira Gatti. 2014. Deep convolutional neural networks for sentiment analysis of short texts. In COLING. 69-78.
Paul Ekman. 1992. An argument for basic emotions. Cogn. Emot. 6, 3-4 (1992), 169-200.
Paul Ekman, Wallace V. Friesen, and Phoebe Ellsworth. 1972. Emotion in the Human Face: Guidelines for Research and an Integration of Findings. Pergamon, New York.
Moataz El Ayadi, Mohamed S. Kamel, and Fakhri Karray. 2011. Survey on speech emotion recognition: Features, classification schemes, and databases. Pattern Recogn. 44, 3 (2011), 572-587.
Andrea Esuli and Fabrizio Sebastiani. 2005. Determining the semantic orientation of terms through gloss classification. In Proceedings of the 14th ACM International Conference on Information and Knowledge Management. ACM, 617-624.
Andrea Esuli and Fabrizio Sebastiani. 2006a. Determining term subjectivity and term orientation for opinion mining. In , Vol. 6. 2006.
Andrea Esuli and Fabrizio Sebastiani. 2006b. Sentiwordnet: A publicly available lexical resource for opinion mining. In Proceedings of LREC, Vol. 6. 417-422.
Aidan Finn and Nicholas Kushmerick. 2006. Learning to classify documents according to genre. J. Am. Soc. Inf. Sci. Technol. 57, 11 (2006), 1506-1518.
Elaine Fox. 2008. Emotion Science: Cognitive and Neuroscientific Approaches to Understanding Human Emotions. Palgrave Macmillan.
Michael Gamon. 2004. Sentiment classification on customer feedback data: Noisy data, large feature vectors, and the role of linguistic analysis. In Proceedings of the 20th International Conference on Computational Linguistics. Association for Computational Linguistics, 841.
Michael Gamon and Anthony Aue. 2005. Automatic identification of sentiment vocabulary: Exploiting low association with known sentiment terms. In Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing. Association for Computational Linguistics, 57-64.
Bo Gao, Bettina Berendt, and Joaquin Vanschoren. 2015a. Who is more positive in private? Analyzing sentiment differences across privacy levels and demographic factors in Facebook chats and posts. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015. ACM, 605-610.
Kai Gao, Hua Xu, and Jiushuo Wang. 2015b. Emotion cause detection for Chinese micro-blogs based on ECOCC model. In Advances in Knowledge Discovery and Data Mining. Springer, 3-14.
Ameneh Gholipour Shahraki. 2015. Emotion Detection from Text. Master's thesis. University of Alberta.
Narendra Gupta, Mazin Gilbert, and Giuseppe Di Fabbrizio. 2013. Emotion detection in email customer care. Comput. Intell. 29, 3 (2013), 489-505.
Jeffrey T. Hancock, Christopher Landrigan, and Courtney Silver. 2007. Expressing emotion in text-based communication. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 929-932.
Maryam Hasan, Emmanuel Agu, and Elke Rundensteiner. 2014. Using hashtags as labels for supervised learning of emotions in Twitter messages. In Proceedings of the Health Informatics Workshop (HI-KDD).
Vasileios Hatzivassiloglou and Kathleen R. McKeown. 1997. Predicting the semantic orientation of adjectives. In Proceedings of the 8th Conference on European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 174-181.
Vasileios Hatzivassiloglou and Janyce M. Wiebe. 2000. Effects of adjective orientation and gradability on sentence subjectivity. In Proceedings of the 18th Conference on Computational Linguistics-Volume 1. Association for Computational Linguistics, 299-305.
Dung T. Ho and Tru H. Cao. 2012. A high-order hidden Markov model for emotion detection from textual data. In Knowledge Management and Acquisition for Intelligent Systems. Springer, 94-105.
Minqing Hu and Bing Liu. 2004. Mining and summarizing customer reviews. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 168-177.
Xia Hu, Jiliang Tang, Huiji Gao, and Huan Liu. 2013. Unsupervised sentiment analysis with emotional signals. In Proceedings of the 22nd International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 607-618.
Mukesh Jain and V. Kulkarni. 2014. TexEmo: Conveying emotion from text - the study. Int. J. Comput. Appl. 86, 4 (2014).
Jing Jiang and ChengXiang Zhai. 2007. Instance weighting for domain adaptation in NLP. In ACL, Vol. 7. Citeseer, 264-271.
Nitin Jindal and Bing Liu. 2008. Opinion spam and analysis. In Proceedings of the 2008 International Conference on Web Search and Data Mining. ACM, 219-230.
Yohan Jo and Alice H. Oh. 2011. Aspect and sentiment unification model for online review analysis. In Proceedings of the 4th ACM International Conference on Web Search and Data Mining. ACM, 815-824.
Nobuhiro Kaji and Masaru Kitsuregawa. 2006. Automatic construction of polarity-tagged corpus from HTML documents. In Proceedings of the COLING/ACL on Main Conference Poster Sessions. Association for Computational Linguistics, 452-459.
Nobuhiro Kaji and Masaru Kitsuregawa. 2007. Building lexicon for sentiment analysis from massive collection of HTML documents. In EMNLP-CoNLL. Citeseer, 1075-1083.
Jaap Kamps, M. J. Marx, Robert J. Mokken, and Maarten De Rijke. 2004. Using WordNet to measure semantic orientations of adjectives. Language Resources and Evaluation Conference (LREC) 4 (2004), 1115-1118.
Daekook Kang and Yongtae Park. 2014. Review-based measurement of customer satisfaction in mobile service: Sentiment analysis and VIKOR approach. Expert Syst. Appl. 41, 4 (2014), 1041-1050.
E. C.-C. Kao, Chun-Chieh Liu, Ting-Hao Yang, Chang-Tai Hsieh, and Von-Wun Soo. 2009. Towards text-based emotion detection: A survey and possible improvements. In Proceedings of the 2009 International Conference on Information Management and Engineering (ICIME'09). IEEE, 70-74.
Phil Katz, Matthew Singleton, and Richard Wicentowski. 2007. SWAT-MP: The SemEval-2007 systems for task 5 and task 14. In Proceedings of the 4th International Workshop on Semantic Evaluations. Association for Computational Linguistics, 308-313.
Sunghwan Mac Kim, Alessandro Valitutti, and Rafael A. Calvo. 2010. Evaluation of unsupervised emotion models to textual affect recognition. In Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text. Association for Computational Linguistics, 62-70.
Svetlana Kiritchenko, Xiaodan Zhu, Colin Cherry, and Saif Mohammad. 2014. NRC-Canada-2014: Detecting aspects and sentiment in customer reviews. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). Association for Computational Linguistics and Dublin City University, Dublin, Ireland, 437-442.
Andrea Kleinsmith and Nadia Bianchi-Berthouze. 2013. Affective body expression perception and recognition: A survey. IEEE Trans. Affect. Comput. 4, 1 (2013), 15-33.
Sophia Yat Mei Lee, Ying Chen, Shoushan Li, and Chu-Ren Huang. 2010. Emotion cause events: Corpus construction and analysis. In .