Harnessing Twitter "Big Data" for Automatic Emotion Identification

Authors:
Wenbo Wang;Lu Chen;Krishnaprasad Thirunarayan;Amit P. Sheth
Venue:
SOCIALCOM-PASSAT '12 Proceedings of the 2012 ASE/IEEE International Conference on Social Computing and 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust
Year:
2012

Citing 0
Cited 5

Distant supervision for emotion classification with discrete binary values
CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume 2

Real-time emotion classification of Tweets
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Social structure and depression in TrevorSpace
Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing

Cursing in English on Twitter
Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing

Recognizing and regulating e-learners' emotions based on interactive Chinese texts in e-learning systems
Knowledge-Based Systems

Abstract

User-generated content on Twitter (produced at an enormous rate of 340 million tweets per day) provides a rich source for gleaning people's emotions, which is necessary for a deeper understanding of people's behaviors and actions. Extant studies on emotion identification lack comprehensive coverage of "emotional situations" because they use relatively small training datasets. To overcome this bottleneck, we have automatically created a large emotion-labeled dataset (of about 2.5 million tweets) by harnessing emotion-related hashtags available in the tweets. We have applied two different machine learning algorithms for emotion identification to study the effectiveness of various feature combinations as well as the effect of the size of the training data on the emotion identification task. Our experiments demonstrate that a combination of unigrams, bigrams, sentiment/emotion-bearing words, and parts-of-speech information is most effective for gleaning emotions. The highest accuracy (65.57%) is achieved with training data containing about 2 million tweets.
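
The hashtag-based labeling and the unigram/bigram features described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hashtag-to-emotion map, the function names label_tweet and ngram_features, and the rule of discarding tweets whose hashtags point to more than one emotion are all assumptions made for illustration, and the abstract does not list the hashtags or filtering rules actually used.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical hashtag-to-emotion map (an assumption for illustration;
# the abstract does not give the hashtags used in the paper).
EMOTION_HASHTAGS = {
    "#happy": "joy",
    "#joy": "joy",
    "#sad": "sadness",
    "#angry": "anger",
    "#scared": "fear",
}

HASHTAG_RE = re.compile(r"#\w+")


def label_tweet(text: str) -> Optional[Tuple[str, str]]:
    """Return (cleaned_text, emotion), or None when the tweet carries
    no emotion hashtag or hashtags for more than one emotion."""
    tags = [t.lower() for t in HASHTAG_RE.findall(text)]
    emotions = {EMOTION_HASHTAGS[t] for t in tags if t in EMOTION_HASHTAGS}
    if len(emotions) != 1:
        return None

    # Strip the labeling hashtags so a classifier cannot read the
    # label directly from the text; other hashtags are kept.
    def drop_label(match: "re.Match[str]") -> str:
        tag = match.group(0)
        return "" if tag.lower() in EMOTION_HASHTAGS else tag

    cleaned = " ".join(HASHTAG_RE.sub(drop_label, text).split())
    return cleaned, emotions.pop()


def ngram_features(text: str) -> List[str]:
    """Return lowercase unigrams followed by bigrams (joined with '_')."""
    tokens = text.lower().split()
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams
```

For example, label_tweet("Got the job today #happy") yields the pair ("Got the job today", "joy"), while a tweet with no mapped hashtag is skipped. The resulting feature lists could then be fed to any standard text classifier; the sentiment-lexicon and part-of-speech features mentioned in the abstract are omitted here.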