Discovering and Analyzing World Wide Web Collections

  • Authors: Sougata Mukherjea
  • Affiliations: India Research Lab, IBM, India
  • Venue: Knowledge and Information Systems
  • Year: 2004

Abstract

With the explosive growth of the World Wide Web, it is becoming increasingly difficult for users to discover Web pages that are relevant to a topic. To address this problem, we are developing a system that allows the collection and analysis of Web pages related to a particular topic. In this paper, we present the system’s overall architecture and introduce the focused crawler used by the system. We also discuss the techniques that let the user analyze a collection and gain useful insights into it. Finally, we present some statistics on the collections.