MySpiders: Evolve Your Own Intelligent Web Crawlers

  • Authors:
  • Gautam Pant; Filippo Menczer

  • Affiliations:
  • Gautam Pant: Department of Management Sciences, The University of Iowa, Iowa City, IA 52242 (gautam-pant@uiowa.edu)
  • Filippo Menczer: Department of Management Sciences, The University of Iowa, Iowa City, IA 52242 (filippo-menczer@uiowa.edu)

  • Venue:
  • Autonomous Agents and Multi-Agent Systems
  • Year:
  • 2002

Abstract

The dynamic nature of the World Wide Web makes it a challenge to find information that is both relevant and recent. Intelligent agents can complement the power of search engines to meet this challenge. We present a Web tool called MySpiders, which implements an evolutionary algorithm managing a population of adaptive crawlers that browse the Web autonomously. Each agent acts as an intelligent client on behalf of the user, driven by a user query and by textual and linkage clues in the crawled pages. Agents autonomously decide which links to follow, which clues to internalize, when to spawn offspring to focus the search near a relevant source, and when to starve. The tool is available to the public as a threaded Java applet. We discuss the development and deployment of such a system.