Understanding factors that affect response rates in Twitter

  • Authors:
  • Giovanni Comarela; Mark Crovella; Virgilio Almeida; Fabricio Benevenuto

  • Affiliations:
  • Federal University of Minas Gerais, Belo Horizonte, Brazil; Boston University, Boston, MA, USA; Federal University of Minas Gerais, Belo Horizonte, Brazil; Federal University of Ouro Preto, Ouro Preto, Brazil

  • Venue:
  • Proceedings of the 23rd ACM conference on Hypertext and social media
  • Year:
  • 2012

Abstract

In information networks where users send messages to one another, the issue of information overload naturally arises: which are the most important messages? In this paper we study the problem of understanding the importance of messages in Twitter. We approach this problem in two stages. First, we perform an extensive characterization of a very large Twitter dataset which includes all users, social relations, and messages posted from the beginning of the service up to August 2009. We show evidence that information overload is present: users sometimes have to search through hundreds of messages to find those that are interesting enough to reply to or retweet. We then identify factors that influence the probability that a user replies to or retweets a message: previous responses to the same tweeter, the tweeter's sending rate, and the age and some basic text elements of the tweet. In our second stage, we show that some of these factors can be used to improve the order in which tweets are presented to the user. First, by inspecting user activity over time, we construct a simple on-off model of user behavior that allows us to infer when a user is actively using Twitter. Then, we explore two methods from machine learning for ranking tweets: a Naive Bayes predictor and a Support Vector Machine classifier. We show that it is possible to reorder tweets to increase the fraction of replied or retweeted messages appearing in the first p positions of the list by as much as 50-60%.
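
To make the ranking idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It trains a small Bernoulli Naive Bayes scorer on binary features, reorders a timeline by the estimated log-odds of a response, and measures the share of responded tweets that land in the first p positions. The feature names (`prior_reply`, `has_mention`), the toy data, and the exact definition of the top-p metric are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import defaultdict


def train_naive_bayes(examples, alpha=1.0):
    """Count class and feature frequencies.

    examples: list of (features, responded), where features maps a
    hypothetical feature name to a bool and responded is a bool.
    alpha is the Laplace smoothing constant (an assumption).
    """
    class_counts = {True: 0, False: 0}
    feature_counts = {True: defaultdict(int), False: defaultdict(int)}
    names = set()
    for features, responded in examples:
        class_counts[responded] += 1
        for name, present in features.items():
            names.add(name)
            if present:
                feature_counts[responded][name] += 1
    return class_counts, feature_counts, sorted(names), alpha


def response_log_odds(model, features):
    """Return log P(responded | features) - log P(not responded | features)."""
    class_counts, feature_counts, names, alpha = model
    total = class_counts[True] + class_counts[False]
    score = 0.0
    for label, sign in ((True, 1.0), (False, -1.0)):
        n = class_counts[label]
        log_p = math.log((n + alpha) / (total + 2 * alpha))
        for name in names:
            p = (feature_counts[label][name] + alpha) / (n + 2 * alpha)
            log_p += math.log(p if features.get(name, False) else 1.0 - p)
        score += sign * log_p
    return score


def rank_tweets(model, tweets):
    """Sort tweets so the most likely to draw a response come first."""
    return sorted(
        tweets,
        key=lambda t: response_log_odds(model, t["features"]),
        reverse=True,
    )


def fraction_in_top_p(ordered, p):
    """Share of all responded tweets found in the first p positions
    (one possible reading of the metric named in the abstract)."""
    responded_total = sum(1 for t in ordered if t["responded"])
    if responded_total == 0:
        return 0.0
    return sum(1 for t in ordered[:p] if t["responded"]) / responded_total


# Toy training data (fabricated purely for illustration).
training = [
    ({"prior_reply": True, "has_mention": True}, True),
    ({"prior_reply": True, "has_mention": False}, True),
    ({"prior_reply": False, "has_mention": False}, False),
    ({"prior_reply": False, "has_mention": True}, False),
    ({"prior_reply": False, "has_mention": False}, False),
]
model = train_naive_bayes(training)

# A timeline in its original order; "responded" is the ground truth.
timeline = [
    {"id": 1, "features": {"prior_reply": False, "has_mention": False}, "responded": False},
    {"id": 2, "features": {"prior_reply": False, "has_mention": True}, "responded": False},
    {"id": 3, "features": {"prior_reply": True, "has_mention": False}, "responded": True},
    {"id": 4, "features": {"prior_reply": True, "has_mention": True}, "responded": True},
]
ranked = rank_tweets(model, timeline)

print(fraction_in_top_p(timeline, 2), fraction_in_top_p(ranked, 2))
```

In this toy setup the two responded tweets sit at the end of the original timeline and move to the front after reranking, so the top-2 fraction rises. The sketch shows only the mechanics of the comparison; it says nothing about the size of the improvement reported in the paper.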