Social event detection on Twitter

Authors:
Elena Ilina;Claudia Hauff;Ilknur Celik;Fabian Abel;Geert-Jan Houben
Affiliations:
Web Information Systems, Delft University of Technology, The Netherlands;Web Information Systems, Delft University of Technology, The Netherlands;Middle East Technical University, Turkey;Web Information Systems, Delft University of Technology, The Netherlands;Web Information Systems, Delft University of Technology, The Netherlands
Venue:
ICWE'12 Proceedings of the 12th international conference on Web Engineering
Year:
2012

Abstract

Various applications are developed today on top of microblogging services like Twitter. Engineering Web applications that operate on microblogging data requires appropriate filtering techniques to identify relevant messages. In this paper, we focus on detecting Twitter messages (tweets) that report on social events. We introduce a filtering pipeline that exploits textual features and n-grams to classify messages into event-related and non-event-related tweets. We analyze the impact of preprocessing techniques, achieving accuracies higher than 80%. Further, because our proposed filtering pipeline requires training data, we present a strategy to automate the labeling of that data. When tested on our dataset, this semi-automated method achieves an accuracy of 79%, with results comparable to those of the manual labeling approach.
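As a rough illustration of the kind of pipeline the abstract describes (preprocessing, n-gram features, binary classification), the following is a minimal sketch and not the authors' implementation. The abstract does not name the classifier or the exact features, so the multinomial Naive Bayes model with add-one smoothing, the URL and mention placeholders, the names `preprocess`, `ngrams` and `NgramNaiveBayes`, and the toy tweets are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(text):
    """Lowercase, replace URLs and @mentions with placeholders, tokenize.

    These preprocessing steps are assumptions; the abstract does not list them.
    """
    text = text.lower()
    text = re.sub(r"https?://\S+", " URL ", text)
    text = re.sub(r"@\w+", " USER ", text)
    return re.findall(r"[A-Za-z0-9#']+", text)


def ngrams(tokens, max_n=2):
    """Return all word n-grams for n = 1..max_n, shortest first."""
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class NgramNaiveBayes:
    """Hypothetical multinomial Naive Bayes over word n-gram counts."""

    def __init__(self, max_n=2, alpha=1.0):
        self.max_n = max_n
        self.alpha = alpha  # additive (Laplace) smoothing constant
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            feats = ngrams(preprocess(text), self.max_n)
            self.class_counts[label] += 1
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        feats = ngrams(preprocess(text), self.max_n)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known feature.
            score = math.log(n_docs / total_docs)
            denom = (sum(self.feature_counts[label].values())
                     + self.alpha * len(self.vocab))
            for feat in feats:
                if feat in self.vocab:  # features unseen in training are skipped
                    count = self.feature_counts[label][feat]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data invented for this sketch (not from the paper's dataset).
TRAIN = [
    ("Can't wait for the concert at the arena tonight", "event"),
    ("Live music festival this weekend in the park", "event"),
    ("The concert tonight was amazing", "event"),
    ("I had a sandwich for lunch", "non-event"),
    ("My cat is sleeping on the sofa again", "non-event"),
    ("Need more coffee this morning", "non-event"),
]

clf = NgramNaiveBayes(max_n=2).fit([t for t, _ in TRAIN], [y for _, y in TRAIN])
print(clf.predict("Great concert tonight at the festival"))
print(clf.predict("I need coffee and a sandwich"))
```

With only six training tweets the predictions say nothing about real accuracy; the sketch only shows how preprocessing choices and the n-gram order (`max_n`) feed into the classification step whose impact the abstract reports analyzing.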