Web article quality assessment in multi-dimensional space

  • Authors:
  • Jingyu Han; Xiong Fu; Kejia Chen; Chuandong Wang

  • Affiliations:
  • School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, China (all four authors)

  • Venue:
  • WAIM'11 Proceedings of the 12th international conference on Web-age information management
  • Year:
  • 2011

Abstract

User-generated content (UGC), such as Wikipedia, is emerging on the web at an explosive rate, but its quality varies dramatically. How to rate an article's quality effectively is a focus of both the research and industry communities. Because each quality class shows its own characteristics along different quality dimensions, we propose to learn the web quality corpus by taking those dimensions into account. Each article is regarded as an aggregation of sections, and each section's quality is modelled with a Dynamic Bayesian Network (DBN) with reference to accuracy, completeness and consistency. Each quality class is represented by three dimension corpora: an accuracy corpus, a completeness corpus and a consistency corpus. Finally, we propose two schemes for computing the quality ranking. Experiments show that our approach performs well.
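
To make the multi-dimensional idea concrete, here is a minimal Python sketch of rating an article as an aggregation of sections in a three-dimensional quality space. It is not the authors' method: the abstract does not specify the DBN or the two ranking schemes, so this sketch assumes each section already carries a score per dimension, aggregates sections with a plain mean, and ranks quality classes by Euclidean distance to a per-class profile. The names `Section`, `article_vector`, `rank_classes` and `CLASS_PROFILES`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean

# The three quality dimensions named in the abstract.
DIMENSIONS = ("accuracy", "completeness", "consistency")


@dataclass
class Section:
    # Hypothetical per-dimension scores in [0, 1]. The paper models these
    # with a DBN; here they are simply given as inputs (assumption).
    accuracy: float
    completeness: float
    consistency: float


def article_vector(sections):
    """Aggregate section scores into one point in the 3-D quality space.

    Uses an unweighted mean over sections, which is an assumption of
    this sketch rather than the paper's aggregation rule.
    """
    return tuple(mean(getattr(s, d) for s in sections) for d in DIMENSIONS)


def rank_classes(sections, class_profiles):
    """Return class labels ordered from the closest profile to the farthest."""
    point = article_vector(sections)

    def distance(label):
        profile = class_profiles[label]
        return sqrt(sum((p - q) ** 2 for p, q in zip(point, profile)))

    return sorted(class_profiles, key=distance)


# Hypothetical class profiles standing in for the three dimension corpora
# (accuracy, completeness, consistency) of each quality class.
CLASS_PROFILES = {
    "high": (0.9, 0.9, 0.9),
    "medium": (0.6, 0.5, 0.6),
    "low": (0.2, 0.2, 0.3),
}

article = [Section(0.8, 0.9, 0.85), Section(0.9, 0.7, 0.8)]
ranking = rank_classes(article, CLASS_PROFILES)
print(ranking)
```

The first label in `ranking` is the class whose profile lies nearest to the article. A learned model could replace the fixed profiles and the mean without changing the overall shape of the pipeline.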