Early online identification of attention gathering items in social media

Authors:
Michael Mathioudakis;Nick Koudas;Peter Marbach
Affiliations:
University of Toronto, Toronto, ON, Canada;University of Toronto, Toronto, ON, Canada;University of Toronto, Toronto, ON, Canada
Venue:
Proceedings of the third ACM international conference on Web search and data mining
Year:
2010

Abstract

Activity in social media such as blogs, micro-blogs, and social networks is manifested through interaction that involves text, images, links, and other information items. Naturally, some items attract more attention than others, as expressed by large volumes of linking, commenting, or tagging activity, to name a few examples. High attention can also indicate emerging events, breaking news, or, more generally, information items of interest to a large number of people. The numbers associated with digital social activity are astonishing: millions of blog posts, tweets, and forum updates per day, and millions of tags on photos, news articles, or blogs. Identifying the information items that gather much attention in such a real-time information collective is a challenging task. In this paper, we consider the problem of early online identification of items that gather a lot of attention in social media. We model social media activity using ISIS, a stochastic model for Interacting Streaming Information Sources that intuitively captures the concept of attention-gathering information items. Given the information overload that characterizes digital social activity, we present sequential statistical tests that enable early identification of attention-gathering items. This effectively reduces the set of items one has to monitor in real time in order to identify pieces of information that attract a lot of attention. Experiments on real data demonstrate the utility of our model, as well as the efficiency and effectiveness of the proposed sequential statistical tests.