Statistical measure of quality in Wikipedia

  • Authors:
  • Sara Javanmardi; Cristina Lopes

  • Affiliations:
  • University of California, Irvine; University of California, Irvine

  • Venue:
  • Proceedings of the First Workshop on Social Media Analytics
  • Year:
  • 2010


Abstract

Wikipedia is commonly viewed as the main online encyclopedia. Its content quality, however, has often been questioned due to the open nature of its editing model. A high-quality contribution by an expert may be followed by a low-quality contribution made by an amateur or a vandal; therefore, the quality of each article may fluctuate over time as it goes through iterations of edits by different users. With the increasing use of Wikipedia, the need for a reliable assessment of its content quality is also rising. In this study, we model the evolution of content quality in Wikipedia articles in order to estimate the fraction of time during which articles retain high-quality status. To evaluate the model, we assess the quality of Wikipedia's featured and non-featured articles and show that the model produces results consistent with expectations. As a case study, we apply the model in a CalSWIM mashup whose content is drawn both from highly reliable sources and from Wikipedia, which may be less so. Integrating CalSWIM with a trust management system enables it to use not only recency but also quality as its criteria, and thus filter out vandalized or poor-quality content.
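The core quantity the abstract describes, the fraction of time an article retains high-quality status, can be sketched as a simple computation over a labeled revision history. The sketch below is illustrative only: the function name, the binary quality labels, and the example timestamps are assumptions, not the paper's actual model or data.

```python
# Illustrative sketch (not the paper's method): estimate the fraction of
# time an article spent in a high-quality state, assuming each revision's
# quality label holds until the next revision.
from datetime import datetime

def high_quality_fraction(revisions, end_time):
    """revisions: list of (timestamp, is_high_quality), sorted by time.
    The last revision's label is assumed to hold until end_time."""
    total = (end_time - revisions[0][0]).total_seconds()
    if total <= 0:
        return 0.0
    high = 0.0
    # Pair each revision with the start of the next one (or end_time).
    for (t, good), (t_next, _) in zip(revisions, revisions[1:] + [(end_time, None)]):
        if good:
            high += (t_next - t).total_seconds()
    return high / total

# Hypothetical revision history: an expert edit, a vandalized revision,
# and a repaired revision (all timestamps and labels are made up).
revs = [
    (datetime(2010, 1, 1), True),
    (datetime(2010, 1, 3), False),
    (datetime(2010, 1, 4), True),
]
frac = high_quality_fraction(revs, datetime(2010, 1, 11))
```

In this made-up history the article is high-quality for 9 of 10 days, so `frac` comes out to 0.9; a trust-management layer like the one described for CalSWIM could threshold such a score to filter out vandalized or poor-quality content.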