Timespent based models for predicting user retention

  • Authors:
  • Kushal S. Dave;Vishal Vaingankar;Sumanth Kolar;Vasudeva Varma

  • Affiliations:
  • International Institute of Information Technology Hyderabad, Hyderabad, India;StumbleUpon, San Francisco, CA, USA;StumbleUpon, San Francisco, CA, USA;International Institute of Information Technology Hyderabad, Hyderabad, India

  • Venue:
  • Proceedings of the 22nd international conference on World Wide Web
  • Year:
  • 2013

Abstract

Content discovery is fast becoming the preferred tool for user engagement on the web. Discovery lets users learn about and be entertained by their topics of interest. StumbleUpon is the largest personalized content discovery engine on the Web, delivering more than 1 billion personalized recommendations per month. As a recommendation system, one of the primary metrics we track is retention: whether the user returns to use the product after their initial experience (session) with StumbleUpon. In this paper, we address the problem of predicting user retention based on the user's previous sessions. We first explore the user and content features that are helpful in predicting user retention. This involves mapping the user and the user's recommendations (stumbles) into a descriptive feature space that includes the time spent by the user, the number of stumbles, and content features of the recommendations. To model the diversity in user behaviour, we also generate normalized features that account for the user's speed of stumbling. Using these features, we build a decision tree classifier to predict retention. We find that a model that uses both user and content features achieves higher prediction accuracy than a model that uses either feature set alone. Further, we use information-theoretic analysis to find a subset of recommendations that are most indicative of user retention. A classifier trained on this subset of recommendations achieves the highest prediction accuracy. This indicates that not every recommendation seen by the user is predictive of whether the user will be retained; instead, a subset of the most informative recommendations is more useful in predicting retention.
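
The idea of picking out the most informative recommendations can be illustrated with a minimal sketch. The abstract does not say which information-theoretic measure was used, so the sketch assumes mutual information between a binary "long dwell" flag at each stumble position and the retention label. The function names, the 10-second threshold, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts
        mi += p_xy * math.log2(p_xy * n * n / (px[x] * py[y]))
    return mi


def rank_positions(sessions, retained, threshold):
    """Rank stumble positions by how informative a long/short dwell flag is.

    sessions  -- one list of time-spent values (seconds) per user
    retained  -- one boolean per user: did the user return?
    threshold -- assumed cut-off separating short and long dwell times
    """
    n_positions = min(len(s) for s in sessions)
    scores = []
    for i in range(n_positions):
        flags = [s[i] >= threshold for s in sessions]
        scores.append((mutual_information(flags, retained), i))
    return sorted(scores, reverse=True)


# Toy data, invented for illustration: seconds spent on the first three stumbles.
sessions = [
    [5, 40, 12],
    [6, 35, 3],
    [4, 2, 14],
    [7, 3, 2],
]
retained = [True, True, False, False]

ranking = rank_positions(sessions, retained, threshold=10)
for score, position in ranking:
    print(f"stumble {position}: {score:.2f} bits")
```

In this toy data only the second stumble (position 1) separates retained from non-retained users, so it ranks first. A classifier could then be trained on features from the top-ranked positions only, which is the spirit of the subset result described in the abstract.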