Evaluation of find-similar with simulation and network analysis

  • Authors: James Allan; Mark D. Smucker
  • Affiliations: University of Massachusetts Amherst; University of Massachusetts Amherst
  • Venue: Evaluation of find-similar with simulation and network analysis
  • Year: 2008

Abstract

Every day, people use information retrieval (IR) systems to find documents that satisfy their information needs. Even though IR has revolutionized the way people find information, IR systems can still fail to satisfy people's information needs. In this dissertation, we show how the addition of a simple user interaction mechanism, find-similar, can improve retrieval quality by making it easier for users to navigate from relevant documents to other relevant documents. Find-similar allows a user to request documents similar to a given document. In the first part of the dissertation, we measure find-similar's retrieval potential by simulating a user's behavior with hypothetical user interfaces. We show that find-similar has the potential to improve the retrieval quality of a state-of-the-art IR system by 23% and to match the performance of relevance feedback. In a case study, we first show how find-similar can help PubMed users find relevant documents, and we then show how find-similar responds to varying initial conditions and acts to compensate for poor retrieval quality. In the second part of the dissertation, we characterize find-similar in the absence of a particular user interface by measuring the quality of the document networks formed by find-similar's document-to-document similarity measure. Find-similar effectively creates links between documents that allow the user to navigate documents by similarity. We show that find-similar's similarity measure affects the navigability of the document network, and we show how a query-biased similarity measure can improve find-similar. We develop measures of network navigability and show that find-similar should make the World Wide Web more navigable. Taken together, the simulation of find-similar and the measurement of the navigability of document networks show how find-similar, as a simple user interaction mechanism, can improve a user's ability to find relevant documents.
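
To make the idea of a document network concrete, the following is a minimal sketch and not the dissertation's implementation. It assumes a bag-of-words cosine similarity as a stand-in for the actual document-to-document similarity measure, treats each document's top-k similar documents as outgoing links, and uses shortest path length in clicks as a rough, illustrative proxy for navigability. The toy corpus and the names `vectorize`, `cosine`, `find_similar`, `build_network`, and `clicks_to_reach` are all hypothetical, and query-biased similarity is not modeled here.

```python
import math
from collections import Counter, deque

# Toy corpus (hypothetical): document id -> text.
docs = {
    "a": "cat dog",
    "b": "cat dog bird",
    "c": "bird fish",
    "d": "fish whale",
}

def vectorize(text):
    """Bag-of-words term counts; an assumed stand-in for a real retrieval model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def find_similar(doc_id, docs, k=2):
    """Return up to k other documents ranked by similarity to doc_id."""
    vecs = {d: vectorize(t) for d, t in docs.items()}
    scores = [(cosine(vecs[doc_id], vecs[d]), d) for d in docs if d != doc_id]
    # Drop documents with no term overlap; break ties by document id.
    scores = [s for s in scores if s[0] > 0]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [d for _, d in scores[:k]]

def build_network(docs, k=2):
    """Directed network: each document links to its find-similar results."""
    return {d: find_similar(d, docs, k) for d in docs}

def clicks_to_reach(network, start, target):
    """Fewest find-similar clicks from start to target (BFS); None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in network[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None
```

In this toy setup, showing only one similar document per page splits the corpus into two disconnected pairs, while showing two makes document "d" reachable from "a". This illustrates, under the stated assumptions, how the choice of similarity measure and result-list size can change whether a user can navigate from one relevant document to another.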