Assessing web article quality by harnessing collective intelligence

  • Authors:
  • Jingyu Han; Xueping Chen; Kejia Chen; Dawei Jiang

  • Affiliations:
  • School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, P.R. China; School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, P.R. China; School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, P.R. China; School of Computing, National University of Singapore, Singapore

  • Venue:
  • DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
  • Year:
  • 2012

Abstract

Existing approaches assess the quality of web articles mainly on the basis of syntax, and little work has addressed how to quantify quality on the basis of semantics. In this paper we propose a novel Semantic Quality Assessment (SQA) approach that automatically determines data quality in terms of the two most important quality dimensions, namely accuracy and completeness. First, an alternative context for the source article is built by collecting alternative web articles. Second, each alternative article is transformed into and represented by a semantic corpus, and dimension baselines are synthetically generated from these semantic corpora. Finally, each quality dimension of the source article is determined by comparing its semantic corpus with the corresponding dimension baseline. Our approach is a promising way to assess web article quality by exploiting available collective knowledge. Experiments show that our approach performs well.
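
The three steps in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: the abstract does not say how a semantic corpus or a dimension baseline is built, so everything here is an assumption made for illustration. A semantic corpus is reduced to a set of lowercase words, the baseline is the set of items found in at least a fraction `min_support` of the alternative articles, accuracy is read as the share of the source's items that the baseline confirms, and completeness as the share of the baseline that the source covers. The function names `semantic_corpus`, `dimension_baseline`, and `assess` are hypothetical.

```python
from collections import Counter


def semantic_corpus(text):
    """Toy stand-in for a semantic corpus: the set of lowercase words (assumption)."""
    words = {w.strip(".,;:!?()").lower() for w in text.split()}
    return words - {""}  # drop tokens that were pure punctuation


def dimension_baseline(alternative_corpora, min_support=0.5):
    """Keep items found in at least min_support of the alternative corpora (assumption)."""
    counts = Counter(item for corpus in alternative_corpora for item in corpus)
    threshold = min_support * len(alternative_corpora)
    return {item for item, n in counts.items() if n >= threshold}


def assess(source_text, alternative_texts, min_support=0.5):
    """Return (accuracy, completeness) of the source against the baseline."""
    source = semantic_corpus(source_text)
    baseline = dimension_baseline(
        [semantic_corpus(t) for t in alternative_texts], min_support
    )
    overlap = source & baseline
    # Accuracy: how much of the source is confirmed by the collective baseline.
    accuracy = len(overlap) / len(source) if source else 0.0
    # Completeness: how much of the collective baseline the source covers.
    completeness = len(overlap) / len(baseline) if baseline else 0.0
    return accuracy, completeness
```

Under these assumptions, a source article that repeats only well-supported items scores high on accuracy but may still score low on completeness if it omits items that most alternative articles mention.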