Web mining; Applying the fine-to-coarse sentiment analysis framework to emotion detection


Title: Applying Crowd-Sourcing in Affect Analysis of Dutch Social Media
Status: ongoing
Master: Knowledge Technology and Intelligent Internet Applications
Student name: Laurens Rietveld
Start date: 2010/02/01
End date: 2010/06/30
Supervisor: Stefan Schlobach
Company: GfK Daphne
Poster: Media:Media-AffectAnalysislrd500.pdf


The internet gave us, among other things, a platform for participating in online communities and voicing our opinions. This development prompted research into mining such opinions (sentiment analysis). It was a new area of research: before the internet, no comparable medium offered such a vast amount of data containing opinions.
In sentiment analysis (also called opinion analysis), the focus is on retrieving the subjective message (the opinion) from a text. Typically the text is assigned to one of two or three classes: positive, negative, and sometimes neutral [1]. Emotion detection is an extension of sentiment analysis. Whereas regular sentiment analysis determines only whether a message is positive or negative, emotion detection goes a step further and determines which type of emotion the text expresses. Izard [2] lists the following basic emotions:

  • Anger
  • Disgust
  • Fear
  • Guilt
  • Interest
  • Joy
  • Sadness
  • Shame
  • Surprise

In some applications a regular positive/negative sentiment analysis is enough, whereas other applications require a more thorough analysis of the text and therefore emotion detection. In market research, emotion detection is of interest because it delivers more comprehensive information. Retrieving useful information from a piece of text is by no means straightforward: the unstructured nature of natural language makes analyzing the content a difficult task. The regular information retrieval approach uses the document frequency of terms to determine the class of a text. This approach does not take the sentence and document structure of the text into account. Consider the following example [3]: “This is the first Mp3 player that I have used ... I thought it sounded great ... After only a few weeks, it started having trouble with the earphone connection ... I won’t be buying another.”

Analyzing this text by the document frequency of its terms will probably result in a false positive, because the positive wording early in the review can outweigh the negative conclusion at the end. A different approach to this problem is the fine-to-coarse sentiment analysis framework [3]. This framework iteratively analyzes the whole document, each sentence separately, and the sentence structure. The result should be a more thorough evaluation of the text. The fine-to-coarse framework has been applied to regular sentiment analysis tasks, where the classes were either positive or negative, but it has not yet been applied to an emotion detection task.
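
The difference between a whole-document term count and a sentence-level view can be illustrated with a minimal sketch on the review quoted above. This is not the structured model of [3]: the lexicon, its weights, the splitting on "..." and the "last opinionated sentence decides" heuristic are all assumptions made up for illustration.

```python
import re

# Hypothetical toy lexicon with made-up weights (an assumption, not taken from [3]).
LEXICON = {"great": 2, "trouble": -1}

REVIEW = ("This is the first Mp3 player that I have used ... "
          "I thought it sounded great ... "
          "After only a few weeks, it started having trouble with the "
          "earphone connection ... I won't be buying another.")


def score(text):
    """Sum the lexicon weights of all word tokens in the text."""
    return sum(LEXICON.get(tok, 0) for tok in re.findall(r"[a-z']+", text.lower()))


def label(value):
    """Map a numeric score to a polarity label."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def document_label(text):
    """Coarse view only: treat the whole document as one bag of terms."""
    return label(score(text))


def sentence_labels(text):
    """Fine view: label each sentence (split on '...') separately."""
    sentences = [s.strip() for s in text.split("...") if s.strip()]
    return [label(score(s)) for s in sentences]


def last_opinion_label(text):
    """Illustrative heuristic: the last non-neutral sentence decides."""
    opinions = [lab for lab in sentence_labels(text) if lab != "neutral"]
    return opinions[-1] if opinions else "neutral"


print(document_label(REVIEW))
print(sentence_labels(REVIEW))
print(last_opinion_label(REVIEW))
```

With this toy lexicon, the whole-document count labels the review positive, while the per-sentence labels show the shift from a positive to a negative sentence that the term count hides. Note that the final sentence ("I won't be buying another") is still labelled neutral, so even this sketch misses part of the opinion.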

In this research, I will study how to combine the fine-to-coarse sentiment analysis framework with an emotion detection task. The datasets used in this research will be taken from blogs and forums on the internet. I will apply my work at the market research company GfK Daphne.

[1] Pang, B., Lee, L. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval 2(1-2), pp. 1–135, 2008.
[2] Izard, C.E. Human emotions. NY: Plenum Press, 1977.
[3] McDonald, R. Structured models for fine-to-coarse sentiment analysis. Proceedings of the Annual Conference of the Association for Computational Linguistics (ACL), 2007.