Learning from history: predicting reverted work at the word level in Wikipedia

Authors:
Jeffrey Rzeszotarski; Aniket Kittur
Affiliations:
Carnegie Mellon University, Pittsburgh, Pennsylvania, USA; Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
Venue:
Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work
Year:
2012

Abstract

Wikipedia's remarkable success in aggregating millions of contributions can pose a challenge for current editors, whose hard work may be reverted unless they understand and follow established norms, policies, and decisions and avoid contentious or proscribed terms. We present a machine learning model that predicts whether a contribution will be reverted based on word-level features. Unlike previous models that rely on editor-level characteristics, our model can make accurate predictions based only on the words a contribution changes. A key advantage of the model is that it can provide feedback not only on whether a contribution is likely to be rejected, but also on the particular words that are likely to be controversial, enabling new forms of intelligent interfaces and visualizations. We examine the performance of the model across a variety of Wikipedia articles.
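
The abstract does not say which learning algorithm or feature encoding the authors use, so the following is only a minimal illustrative sketch of the general idea: score an edit from the words it adds or removes, and report which of those words push the score toward "reverted". It swaps in a simple naive Bayes classifier with Laplace smoothing as a stand-in. The names `changed_words`, `RevertModel`, and the toy `history` data are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def changed_words(before, after):
    """Word-level features of an edit: '+w' for added words, '-w' for removed ones."""
    old = Counter(before.lower().split())
    new = Counter(after.lower().split())
    added = new - old      # Counter subtraction keeps only positive counts
    removed = old - new
    return [f"+{w}" for w in added] + [f"-{w}" for w in removed]


class RevertModel:
    """Naive Bayes over changed-word features (an assumed stand-in, not the paper's model)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant
        self.word_counts = {True: Counter(), False: Counter()}
        self.edit_counts = {True: 0, False: 0}
        self.vocab = set()

    def fit(self, edits):
        """edits: iterable of (text_before, text_after, was_reverted)."""
        for before, after, reverted in edits:
            feats = changed_words(before, after)
            self.word_counts[reverted].update(feats)
            self.edit_counts[reverted] += 1
            self.vocab.update(feats)
        return self

    def word_log_odds(self, feat):
        """Smoothed log-odds that a feature appears in reverted vs. kept edits."""
        def log_p(label):
            total = sum(self.word_counts[label].values())
            numerator = self.word_counts[label][feat] + self.alpha
            return math.log(numerator / (total + self.alpha * len(self.vocab)))
        return log_p(True) - log_p(False)

    def score(self, before, after):
        """Return (total log-odds of revert, per-word contributions)."""
        prior = math.log((self.edit_counts[True] + self.alpha)
                         / (self.edit_counts[False] + self.alpha))
        # Features never seen in training carry no evidence and are skipped.
        feats = [f for f in changed_words(before, after) if f in self.vocab]
        per_word = {f: self.word_log_odds(f) for f in feats}
        return prior + sum(per_word.values()), per_word


# Toy revision history, invented purely for illustration.
history = [
    ("the city is large", "the city is stupid", True),
    ("he was a senator", "he was a stupid senator", True),
    ("the river is long", "the river is 120 km long", False),
    ("born in 1950", "born in 1950 in Ohio", False),
]
model = RevertModel().fit(history)

risky_total, risky_words = model.score("the mayor is tall", "the mayor is stupid")
safe_total, safe_words = model.score("the lake is deep", "the lake is 120 km deep")
```

A positive total suggests the edit resembles previously reverted work, and the per-word dictionary is the kind of signal an interface could use to highlight potentially controversial terms. A realistic system would train on actual article revision histories and would need a proper diff and tokenizer rather than whitespace splitting.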