Compiling document collections from the Internet

  • Authors: V. Kluev
  • Affiliations: The Core and Information Technology Center, The University of Aizu, Tsuruga, Ikki-machi, Aizu-Wakamatsu City, Fukushima, 965-8580, Japan
  • Venue: ACM SIGIR Forum
  • Year: 2000

Abstract

Domain-specific search engines are becoming popular because they offer greater accuracy than general-purpose search engines. In this study, a method for collecting domain-specific documents from the Internet was developed for the purpose of improving search results. The main thrust of our approach is to use several metrics to estimate the relevance, to a topic of interest, of every document automatically discovered by a crawler. This approach produced two important findings. First, the time required for manual analysis of the content of documents discovered by the crawler was significantly reduced; second, the content quality of the selected documents was improved. The rough estimates of precision and recall calculated in this study suggest that the approach offers great promise.
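
The abstract does not name the metrics used, so the following Python sketch is only an illustration of the general idea: score each crawled document against a topic with more than one metric, combine the scores, and evaluate the selection with precision and recall. The two metrics (keyword coverage and keyword density), the equal weights, and all function names are assumptions made for this example, not the author's method.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_coverage(tokens, topic_terms):
    """Hypothetical metric 1: fraction of topic terms present in the document."""
    if not topic_terms:
        return 0.0
    present = set(tokens)
    return sum(1 for term in topic_terms if term in present) / len(topic_terms)


def keyword_density(tokens, topic_terms):
    """Hypothetical metric 2: fraction of document tokens that are topic terms."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[term] for term in topic_terms) / len(tokens)


def relevance(text, topic_terms, weights=(0.5, 0.5)):
    """Combine the two metrics into one score (weights are an assumption)."""
    tokens = tokenize(text)
    w_cov, w_den = weights
    return (w_cov * keyword_coverage(tokens, topic_terms)
            + w_den * keyword_density(tokens, topic_terms))


def precision_recall(selected, relevant):
    """Standard precision and recall of a selected set against a relevant set."""
    selected, relevant = set(selected), set(relevant)
    hits = len(selected & relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In a crawler, a document would be kept for the collection when `relevance(...)` exceeds a threshold chosen for the topic; the threshold and the weights would need tuning against a manually judged sample.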