Using Stochastic Models to Describe and Predict Social Dynamics of Web Users

  • Authors:
  • Kristina Lerman;Tad Hogg

  • Affiliations:
  • USC Information Sciences Institute;Institute for Molecular Manufacturing

  • Venue:
  • ACM Transactions on Intelligent Systems and Technology (TIST)
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

The popularity of content in social media is unequally distributed, with some items receiving a disproportionate share of attention from users. Predicting which newly-submitted items will become popular is critically important for both the hosts of social media content and its consumers. Accurate and timely prediction would enable hosts to maximize revenue through differential pricing for access to content or ad placement. Prediction would also give consumers an important tool for filtering the content. Predicting the popularity of content in social media is challenging due to the complex interactions between content quality and how the social media site highlights its content. Moreover, most social media sites selectively present content that has been highly rated by similar users, whose similarity is indicated implicitly by their behavior or explicitly by links in a social network. While these factors make it difficult to predict popularity a priori, stochastic models of user behavior on these sites can allow predicting popularity based on early user reactions to new content. By incorporating the various mechanisms through which web sites display content, such models improve on predictions that are based on simply extrapolating from the early votes. Specifically, for one such site, the news aggregator Digg, we show how a stochastic model distinguishes the effect of the increased visibility due to the network from how interested users are in the content. We find a wide range of interest, distinguishing stories primarily of interest to users in the network (“niche interests”) from those of more general interest to the user community. This distinction is useful for predicting a story’s eventual popularity from users’ early reactions to the story.