Efficient filtering in micro-blogging systems: we won't get flooded again

  • Authors:
  • Ryadh Dahimene; Cedric Du Mouza; Michel Scholl

  • Affiliations:
  • CEDRIC Laboratory, CNAM, Paris, France; CEDRIC Laboratory, CNAM, Paris, France; CEDRIC Laboratory, CNAM, Paris, France

  • Venue:
  • SSDBM'12: Proceedings of the 24th International Conference on Scientific and Statistical Database Management
  • Year:
  • 2012

Abstract

In recent years, micro-blogging systems have become highly successful. Twitter, for instance, claims more than 200 million accounts after 5 years of existence and daily traffic of more than 200 million tweets, leading to 350 billion delivered tweets. Micro-blogging systems rely on the all-or-nothing paradigm: a user receives all the posts from an account he follows. A consequence for the user is the risk of flooding, i.e., the number of posts received forces a time-consuming scan of his list of postings to find the news that matches his interests. To avoid user flooding and to significantly reduce the number of posts to be delivered, we propose a filtering structure for micro-blogging systems. We present an analytical model and an experimental study on synthetic datasets and on a real Twitter dataset that consists of more than 2.1 million users, 15.7 million tweets and 148.5 million publisher-follower relationships.