PageRank without hyperlinks: Structural reranking using links induced by language models

  • Authors:
  • Oren Kurland;Lillian Lee

  • Affiliations:
  • Technion;Cornell University

  • Venue:
  • ACM Transactions on Information Systems (TOIS)
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

The ad hoc retrieval task is to find documents in a corpus that are relevant to a query. Inspired by the PageRank and HITS (hubs and authorities) algorithms for Web search, we propose a structural reranking approach to ad-hoc retrieval that applies to settings with no hyperlink information. We reorder the documents in an initially retrieved set by exploiting implicit asymmetric relationships among them. We consider generation links, which indicate that the language model induced from one document assigns high probability to the text of another. We study a number of reranking criteria based on measures of centrality in the graphs formed by generation links, and show that integrating centrality into standard language-model-based retrieval is quite effective at improving precision at top ranks; the best resultant performance is comparable, and often superior, to that of a state-of-the-art pseudo-feedback-based retrieval approach. In addition, we demonstrate the merits of our language-model-based method for inducing interdocument links by comparing it to previously suggested notions of interdocument similarities (e.g., cosines within the vector-space model).We also show that ourmethods for inducing centrality are substantially more effective than approaches based on document-specific characteristics, several of which are novel to this study.