A document-document similarity measure based on cited titles and probability theory, and its application to relevance feedback retrieval

  • Authors:
  • K. L. Kwok

  • Affiliations:
  • Queens College, CUNY, Flushing, NY

  • Venue:
  • SIGIR '84 Proceedings of the 7th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 1984

Abstract

The use of cited title terms of a scientific document for automatic indexing is explored. It offers a means of index term selection as well as term relevance weighting, based on author-provided relevance information and Bayes' theorem, as in probabilistic retrieval. The latter quantitative consideration leads to a new document-document similarity measure, which is shown to be important both for initial search and for relevance feedback retrieval, where it offers a choice of iterative strategies. Extension of the concept of cited title terms to citing title terms shows that these two approaches are compatible with the two current competing models of probability of relevance for document retrieval (Robertson et al. 1982), if a document can also be regarded as a query. Their term usage may therefore provide the statistics needed for parameter estimation to test both theories.