Learning to Understand Information on the Internet: An Example-Based Approach

  • Authors:
  • Mike Perkowitz; Robert B. Doorenbos; Oren Etzioni; Daniel S. Weld

  • Affiliations:
  • Department of Computer Science and Engineering, Box 352350, University of Washington, Seattle, WA 98195-2350. E-mail: map@cs.washington.edu, bobd@cs.washington.edu, etzioni@cs.washington.edu, we ...

  • Venue:
  • Journal of Intelligent Information Systems - Special issue: next generation information technologies and systems
  • Year:
  • 1997

Abstract

The explosive growth of the Web has made intelligent software assistants increasingly necessary for ordinary computer users. Both traditional approaches (search engines, hierarchical indices) and intelligent software agents require significant amounts of human effort to keep up with the Web. As an alternative, we investigate the problem of automatically learning to interact with information sources on the Internet. We report on ShopBot and ILA, two implemented agents that learn to use such resources. ShopBot learns how to extract information from online vendors using only minimal knowledge about product domains. Given the home pages of several online stores, ShopBot autonomously learns how to shop at those vendors. After its learning is complete, ShopBot is able to speedily visit over a dozen software stores and CD vendors, extract product information, and summarize the results for the user. ILA learns to translate information from Internet sources into its own internal concepts. ILA builds a model of an information source that specifies the translation between the source's output and ILA's model of the world. ILA is capable of leveraging a small amount of knowledge about a domain to learn models of many information sources. We show that ILA's learning is fast and accurate, requiring only a small number of queries per information source.