Pairwise similarity of TopSig document signatures

  • Authors: Christopher M. De Vries; Shlomo Geva
  • Affiliations: Queensland University of Technology, Brisbane, Australia; Queensland University of Technology, Brisbane, Australia
  • Venue: Proceedings of the Seventeenth Australasian Document Computing Symposium
  • Year: 2012

Abstract

This paper analyses the pairwise distances between signatures produced by the TopSig retrieval model on two document collections. The distribution of these distances is compared with that of purely random signatures. The comparison explains why TopSig is competitive with state-of-the-art retrieval models only at early precision. Only the local neighbourhood of a signature is interpretable. We suggest this is a common property of vector space models.
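
To make the random baseline mentioned in the abstract concrete, the following is a minimal sketch and not the authors' code. It assumes that signatures are fixed-width binary strings compared by Hamming distance; the signature width `N_BITS`, the sample size, and all function names are hypothetical choices made for illustration. For uniformly random signatures, each pairwise distance follows a Binomial(`N_BITS`, 0.5) distribution, so distances cluster tightly around half the signature width.

```python
import random
from itertools import combinations
from statistics import mean, pstdev

N_BITS = 1024   # hypothetical signature width, chosen for illustration
N_SIGS = 200    # hypothetical number of random signatures


def random_signature(n_bits, rng):
    """Return a uniformly random n_bits-wide signature packed into an int."""
    return rng.getrandbits(n_bits)


def hamming(a, b):
    """Count the bit positions in which two packed signatures differ."""
    return bin(a ^ b).count("1")


def pairwise_distances(signatures):
    """Return the Hamming distance for every unordered pair of signatures."""
    return [hamming(a, b) for a, b in combinations(signatures, 2)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
signatures = [random_signature(N_BITS, rng) for _ in range(N_SIGS)]
distances = pairwise_distances(signatures)

# For random signatures the mean is close to N_BITS / 2 and the standard
# deviation is close to sqrt(N_BITS) / 2.
print(f"pairs: {len(distances)}")
print(f"mean distance: {mean(distances):.1f}")
print(f"std deviation: {pstdev(distances):.1f}")
```

Under these assumptions, a pair of document signatures whose distance falls well below this narrow band is closer than chance would predict, while pairs inside the band cannot be told apart from random ones. This is one way to read the abstract's statement that only the local neighbourhood of a signature is interpretable.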