Mirroring the real world in social media: Twitter, geolocation, and sentiment analysis

Authors:
Eric Baucom;Azade Sanjari;Xiaozhong Liu;Miao Chen
Affiliations:
Indiana University, Bloomington, IN, USA;Indiana University, Bloomington, IN, USA;Indiana University, Bloomington, IN, USA;Syracuse University, Syracuse, NY, USA
Venue:
Proceedings of the 2013 international workshop on Mining unstructured big data using natural language processing
Year:
2013

Abstract

In recent years, social media has been used to characterize and predict real-world events. In this research we investigate how closely Twitter mirrors the real world. Specifically, we characterize the relationship between the language used on Twitter and the results of the 2011 NBA Playoff games. We hypothesize that the language used by Twitter users will be useful in classifying the users' locations combined with which team is in the lead during the game. This is based on the common assumption that "fans" of a team have more positive sentiment and will accordingly use different language when their team is doing well. We investigate this hypothesis by labeling each tweet according to the location of the user along with the team that is in the lead at the time of the tweet. The hypothesized difference in language (as measured by tf-idf) should then have predictive power over the tweet labels. We find that it does, and we experiment further by adding semantic orientation (SO) information to the feature set. The SO features offer little improvement over tf-idf alone. We discuss the relative strengths of the two types of features for our data.
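
The labeling and tf-idf weighting described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `make_label` and `tfidf`, the label format, the whitespace tokenization, the particular tf-idf variant (raw term frequency times log(N / df)), and the toy tweets are all assumptions made for illustration.

```python
import math
from collections import Counter


def make_label(user_location, leading_team):
    """Combine the user's location with the team in the lead (assumed label format)."""
    return f"{user_location}|{leading_team}"


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: raw term frequency times log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


# Hypothetical toy data: (tweet text, user location, team in the lead at tweet time).
tweets = [
    ("great shot what a game", "CityA", "TeamA"),
    ("terrible defense tonight", "CityB", "TeamA"),
    ("what a great comeback", "CityB", "TeamB"),
]

labels = [make_label(loc, lead) for _, loc, lead in tweets]
vectors = tfidf([text.split() for text, _, _ in tweets])
```

Each `vectors[i]` paired with `labels[i]` would form one training example for a classifier. Terms that occur in fewer tweets receive higher weights, which is why tf-idf can surface vocabulary that differs between label groups.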