Automatically Mining Result Records from Search Engine Response Pages

  • Authors:
  • Dheerendranath Mundluru;Jayasimha Reddy Katukuri;Saygin Celebi

  • Affiliations:
  • University of Louisiana at Lafayette;University of Louisiana at Lafayette;University of Louisiana at Lafayette

  • Venue:
  • ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
  • Year:
  • 2005

Abstract

Web applications such as deep Web crawlers, metasearch engines, and other Web mining systems often need to extract the result records that search engines display on response pages in reply to submitted queries. Extracting such records is challenging because search engines display their records in heterogeneous ways. In addition, the response pages returned by many search engines include noisy content such as advertisements and suggestion links, which makes the extraction task even more complicated. In this paper, we propose a highly effective and efficient algorithm for automatically mining result records from search engine response pages.