Syskill & Webert: Identifying interesting web sites

  • Authors:
  • Michael Pazzani; Jack Muramatsu; Daniel Billsus

  • Affiliations:
  • Department of Information and Computer Science, University of California, Irvine, Irvine, CA (all three authors)

  • Venue:
  • AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
  • Year:
  • 1996

Abstract

We describe Syskill & Webert, a software agent that learns to rate pages on the World Wide Web (WWW), deciding which pages might interest a user. The user rates explored pages on a three-point scale, and Syskill & Webert learns a user profile by analyzing the information on each page. The user profile can be used in two ways. First, it can be used to suggest which links a user would be interested in exploring. Second, it can be used to construct a LYCOS query to find pages that would interest a user. We compare six different algorithms from machine learning and information retrieval on this task. We find that the naive Bayesian classifier offers several advantages over other learning algorithms on this task. Furthermore, we find that an initial portion of a web page is sufficient for making predictions about its interestingness, which substantially reduces the amount of network transmission required to make predictions.
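
To make the idea of learning a profile from rated pages concrete, the following is a minimal sketch of a naive Bayesian classifier over word-presence features. It is not the authors' implementation: the class name `NaiveBayesProfile`, the `tokenize` helper, the `max_chars` parameter (standing in for "an initial portion of a web page"), the Laplace smoothing, and the rating labels used in the usage note are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text, max_chars=None):
    """Return the set of lowercase words in text.

    If max_chars is given, only the first max_chars characters are read,
    mimicking prediction from an initial portion of a page (assumption).
    """
    if max_chars is not None:
        text = text[:max_chars]
    return set(re.findall(r"[a-z]+", text.lower()))


class NaiveBayesProfile:
    """Hypothetical user profile: naive Bayes over boolean word features."""

    def __init__(self, vocabulary):
        self.vocabulary = sorted(vocabulary)
        self.class_counts = Counter()   # rating -> number of rated pages
        self.word_counts = {}           # rating -> Counter(word -> pages containing it)

    def fit(self, pages, ratings, max_chars=None):
        for text, rating in zip(pages, ratings):
            words = tokenize(text, max_chars)
            self.class_counts[rating] += 1
            counts = self.word_counts.setdefault(rating, Counter())
            for word in self.vocabulary:
                if word in words:
                    counts[word] += 1
        return self

    def log_scores(self, text, max_chars=None):
        """Unnormalized log posterior of each rating for one page."""
        words = tokenize(text, max_chars)
        total = sum(self.class_counts.values())
        scores = {}
        for rating, n in self.class_counts.items():
            score = math.log(n / total)  # class prior
            for word in self.vocabulary:
                # Laplace-smoothed P(word present | rating); smoothing is an assumption.
                p = (self.word_counts[rating][word] + 1) / (n + 2)
                score += math.log(p if word in words else 1.0 - p)
            scores[rating] = score
        return scores

    def predict(self, text, max_chars=None):
        scores = self.log_scores(text, max_chars)
        return max(scores, key=scores.get)
```

A profile would be trained by calling `fit` with the text of the rated pages and the user's ratings (for example, illustrative labels such as `"hot"` and `"cold"`), after which `predict` can be applied to the text behind an unexplored link. Passing a small `max_chars` restricts both training and prediction to the start of each page, so less of the page would need to be fetched.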