Learning from history: predicting reverted work at the word level in Wikipedia

  • Authors:
  • Jeffrey Rzeszotarski; Aniket Kittur

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, Pennsylvania, USA; Carnegie Mellon University, Pittsburgh, Pennsylvania, USA

  • Venue:
  • Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work
  • Year:
  • 2012

Abstract

Wikipedia's remarkable success in aggregating millions of contributions can pose a challenge for current editors, whose hard work may be reverted unless they understand and follow established norms, policies, and decisions and avoid contentious or proscribed terms. We present a machine learning model for predicting whether a contribution will be reverted based on word-level features. Unlike previous models relying on editor-level characteristics, our model can make accurate predictions based only on the words a contribution changes. A key advantage of the model is that it can provide feedback on not only whether a contribution is likely to be rejected, but also the particular words that are likely to be controversial, enabling new forms of intelligent interfaces and visualizations. We examine the performance of the model across a variety of Wikipedia articles.
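
The abstract does not say which learning algorithm or feature encoding the authors use, so the following is only a minimal sketch of the general idea: score an edit from the words it changes and report which words push the score toward a revert. It assumes a simplified naive Bayes-style log-odds scorer with add-one smoothing. The function names (`train`, `word_log_odds`, `score_edit`) and the tiny `history` list are hypothetical and invented for illustration; they are not data or code from the paper.

```python
import math
from collections import Counter

def train(edits):
    """edits: list of (changed_words, was_reverted) pairs (hypothetical format)."""
    word_counts = {True: Counter(), False: Counter()}
    class_counts = Counter()
    for words, reverted in edits:
        class_counts[reverted] += 1
        # Count each word once per edit (presence, not frequency).
        word_counts[reverted].update(set(words))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, class_counts, vocab

def word_log_odds(model, word):
    """Smoothed log of P(word | reverted) / P(word | kept)."""
    word_counts, class_counts, _ = model
    p_rev = (word_counts[True][word] + 1) / (class_counts[True] + 2)
    p_kept = (word_counts[False][word] + 1) / (class_counts[False] + 2)
    return math.log(p_rev / p_kept)

def score_edit(model, words):
    """Return (revert probability, per-word log-odds contributions)."""
    _, class_counts, vocab = model
    prior = math.log((class_counts[True] + 1) / (class_counts[False] + 1))
    # Words never seen in training carry no evidence and are skipped.
    contributions = {w: word_log_odds(model, w) for w in set(words) if w in vocab}
    total = prior + sum(contributions.values())
    return 1 / (1 + math.exp(-total)), contributions

# Toy, made-up edit history: (words changed by the edit, was it reverted?)
history = [
    (["notorious", "cult"], True),
    (["so-called", "notorious"], True),
    (["born", "1950"], False),
    (["population", "census"], False),
]

model = train(history)
prob, contributions = score_edit(model, ["notorious", "born"])
for word, weight in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {weight:+.2f}")
print(f"estimated revert probability: {prob:.2f}")
```

Because the score is a sum of per-word terms, the same quantities that drive the prediction can be shown to an editor as word-level feedback, which is the property the abstract highlights. A real system would need a far larger revision history and a properly evaluated model.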