Statistical approach to estimate the quality of web datasets

  • Authors:
  • Vitaly Klyuev

  • Affiliations:
  • Software Engineering Laboratory, University of Aizu, Aizu-Wakamatsu City, Fukushima, Japan

  • Venue:
  • CIMMACS'05 Proceedings of the 4th WSEAS international conference on Computational intelligence, man-machine systems and cybernetics
  • Year:
  • 2005

Abstract

Finding appropriate information on the Web is becoming more difficult because the tools currently in use are inefficient. A topic-specific approach to building crawlers is promising. In this paper, we discuss a technique that uses methods of statistical analysis to evaluate the quality of crawled documents. We have found this technique to be more robust, more reliable, more practical, and less subjective than others.