Incremental clustering for profile maintenance in information gathering web agents

  • Authors:
  • Gabriel L. Somlo;Adele E. Howe

  • Affiliations:
  • Computer Science Department, Colorado State University, Fort Collins, CO;Computer Science Department, Colorado State University, Fort Collins, CO

  • Venue:
  • Proceedings of the fifth international conference on Autonomous agents
  • Year:
  • 2001

Abstract

User profiles are the central component of most personalized Web information agents. They consist of a set of models representing the various topics of interest to the user. Often the agent learns the user's preferences from examples of documents deemed relevant to the user. The topic of each document can either be supplied by the user (active modeling) or be guessed by the agent (passive modeling), which is more convenient but is expected to diminish the agent's accuracy. We present an empirical study assessing the trade-offs between passive and active document classification. We compare a manual profile maintenance technique, in which the user supplies the document topic, with two incremental clustering methods (greedy and the doubling algorithm) for automated maintenance of the user profile components. The study is performed using our SurfAgent, a testbed information gathering Web agent. Our evaluation methodology exploits the strong parallel between Web information agents and text filtering; we use text filtering benchmarks from the information retrieval community (TREC disk #5) to simulate user behavior and thus speed up data collection, exert additional experimental control, and improve the objectivity of our results.
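
As an illustration of the passive modeling idea above, the following is a minimal sketch of a greedy incremental clustering step. The abstract does not specify the details of the greedy method, so everything here is an assumption made for illustration: the function names (cosine, greedy_assign), the bag-of-words term counts, the cosine similarity measure, and the threshold value of 0.3. Each incoming document joins the most similar existing topic cluster if the similarity reaches the threshold; otherwise it starts a new cluster.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def greedy_assign(clusters, doc_text, threshold=0.3):
    """Assign one document to a cluster and return the cluster index.

    Hypothetical sketch, not the paper's implementation: `clusters` is a
    list of Counters holding summed term counts per topic. Cosine
    similarity ignores vector length, so a summed vector behaves like a
    centroid here.
    """
    vec = Counter(doc_text.lower().split())
    best_idx, best_sim = -1, 0.0
    for i, centroid in enumerate(clusters):
        sim = cosine(vec, centroid)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    if best_idx >= 0 and best_sim >= threshold:
        clusters[best_idx].update(vec)  # Counter.update adds the counts
        return best_idx
    clusters.append(vec)  # no cluster is similar enough: start a new topic
    return len(clusters) - 1


clusters = []
first = greedy_assign(clusters, "python code clustering algorithm")
second = greedy_assign(clusters, "clustering algorithm python vectors")
third = greedy_assign(clusters, "soccer match final score")
```

In this sketch the user never labels a document, which corresponds to the passive setting; in the active setting the cluster index would instead come directly from the user.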