Related, but not Relevant: Content-Based Collaborative Filtering in TREC-8

  • Authors: Ian M. Soboroff; Charles K. Nicholas

  • Affiliations: Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County (both authors). Contact: ian@cs.umbc.edu; nicholas@cs.umbc.edu

  • Venue: Information Retrieval
  • Year: 2002

Abstract

Historically, solutions to the TREC filtering tasks have focused exclusively on the content of documents and search topic descriptions as training data. These approaches are well known for their ability to focus on the salient concepts in the document stream that are most useful for separating relevant documents from irrelevant ones. However, one kind of information that has not been used is the relationships among the topics themselves. In our TREC-8 routing experiments, we employed a collaborative (or social) filtering algorithm, based on latent semantic indexing, which highlights common term usage patterns among groups of filtering profiles. Our hypothesis was that this would allow related topics to share common relevant documents. We found, however, that the algorithm also recommends many documents of related yet irrelevant interest. As a result of this process, many similar search topics are “linked” together by common sets of documents recommended to them. We visualize these topic relationships using graphs in which topics are nodes and an edge exists wherever two topics share a recommended document.
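
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes filtering profiles and documents are rows of term-weight matrices, takes a truncated SVD of the profile collection as the latent semantic space, scores documents by cosine similarity in that space, and then links two topics whenever they share a recommended document. The function names, the rank `k`, the `threshold`, and the toy data are all hypothetical.

```python
from itertools import combinations

import numpy as np


def lsi_profile_scores(profiles, docs, k):
    """Cosine similarity of each profile to each document in a rank-k space.

    profiles: (n_topics, n_terms) term-weight matrix (assumed representation)
    docs:     (n_docs, n_terms) term-weight matrix over the same vocabulary
    """
    # SVD of the profile collection: the leading right singular vectors
    # capture term usage patterns shared among groups of profiles.
    _, _, vt = np.linalg.svd(profiles, full_matrices=False)
    basis = vt[:k].T  # (n_terms, k)

    p = profiles @ basis
    d = docs @ basis

    # Normalize rows so the dot product below is a cosine similarity.
    p = p / np.maximum(np.linalg.norm(p, axis=1, keepdims=True), 1e-12)
    d = d / np.maximum(np.linalg.norm(d, axis=1, keepdims=True), 1e-12)
    return p @ d.T  # (n_topics, n_docs)


def recommend(scores, threshold):
    """For each topic, the set of document indices scoring at or above threshold."""
    return [{int(j) for j in np.flatnonzero(row >= threshold)} for row in scores]


def topic_graph(recommended):
    """Edges (a, b), a < b, between topics sharing at least one recommended document."""
    edges = set()
    for a, b in combinations(range(len(recommended)), 2):
        if recommended[a] & recommended[b]:
            edges.add((a, b))
    return edges


# Toy data (hypothetical): topics 0 and 1 use the same terms, topic 2 does not.
profiles = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [2.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
])
docs = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])

scores = lsi_profile_scores(profiles, docs, k=2)
recommended = recommend(scores, threshold=0.5)
edges = topic_graph(recommended)
print(recommended, edges)
```

In this toy setting, topics 0 and 1 are both recommended document 0 and therefore become linked, while topic 2 stays isolated. The same mechanism illustrates the effect the abstract reports: sharing a latent term pattern is enough to route a document to a related topic, whether or not the document is relevant to it.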