A study of the overlap among document representations

  • Authors:
  • Padima Das-Gupta; Jeffrey Katzer

  • Affiliations:
  • Syracuse University, Syracuse, N.Y.; Syracuse University, Syracuse, N.Y.

  • Venue:
  • SIGIR '83 Proceedings of the 6th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 1983

Abstract

Most previous investigations comparing the performance of different document representations have used recall and precision as performance measures. However, there is evidence that these measures are insensitive to an important difference between representations: two representations may perform similarly on these measures while retrieving very different sets of documents. Equivalence of representations should therefore be decided on the basis of both similarity in performance and similarity in the documents retrieved. This study compared the performance of four representations in the PsycAbs database. Overlap between retrieved sets was also computed, where overlap is the proportion of retrieved documents that are the same for a pair of document representations. Results indicate that, for any two representations considered, performance values differed only slightly while overlap scores were low, supporting the evidence that recall and precision mask differences between the sets of retrieved documents. Results are interpreted to propose an optimal ordering of the representations and to examine the contribution of each representation given this combination.
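
The point about masked differences can be illustrated with a minimal sketch. The abstract does not give the exact overlap formula, so the version below assumes a symmetric form (documents retrieved by both representations divided by documents retrieved by either); the representation labels `rep_a` and `rep_b`, the document IDs, and all function names are hypothetical and not taken from the study.

```python
from itertools import combinations


def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)


def overlap(a, b):
    """Assumed symmetric overlap: shared documents / documents retrieved by either."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def pairwise_overlap(retrieved_by_rep):
    """Overlap for every unordered pair of representations."""
    return {
        (r1, r2): overlap(retrieved_by_rep[r1], retrieved_by_rep[r2])
        for r1, r2 in combinations(sorted(retrieved_by_rep), 2)
    }


# Hypothetical query: four relevant documents, two representations.
relevant = {1, 2, 3, 4}
retrieved_by_rep = {
    "rep_a": {1, 2, 10, 11},
    "rep_b": {3, 4, 12, 13},
}

for name, docs in retrieved_by_rep.items():
    print(name, recall(docs, relevant), precision(docs, relevant))
print(pairwise_overlap(retrieved_by_rep))
```

In this made-up case both representations reach the same recall and precision, yet they share no retrieved documents, which is the kind of difference the two performance measures alone would not reveal.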