Coefficients of combining concept classes in a collection

  • Authors:
  • E. A. Fox; G. L. Nunn; W. C. Lee

  • Affiliations:
  • -; Dept. of Computer Science, Radford University, Radford, VA; Department of Computer Science, 562 McBryde Hall, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

  • Venue:
  • SIGIR '88 Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 1988

Abstract

This report considers combining information to improve retrieval. The vector space model has been extended so that different classes of data are associated with distinct concept types and their respective subvectors. Two collections with multiple concept types are described, ISI-1460 and CACM-3204. Experiments indicate that regression methods can help predict relevance, given query-document similarity values for each concept type. After sampling and transformation of data, the coefficient of determination for the best model was 0.48 for ISI and 0.66 for CACM. Average precision was 11% better for ISI and 31% better for CACM with probabilistic feedback using all concept types than with terms only. These findings may be of particular interest to designers of document retrieval or hypertext systems, since links are shown to be especially beneficial.
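
The regression idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain ordinary least squares with an intercept, cosine similarity per subvector, and two hypothetical concept types named "terms" and "links". The function names and the toy similarity and relevance values are invented for illustration only and are not data from ISI-1460 or CACM-3204.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse subvectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def combined_similarity(query, doc, coeffs):
    """Weighted sum of per-concept-type similarities (coefficients are assumed given)."""
    return sum(c * cosine(query.get(t, {}), doc.get(t, {}))
               for t, c in coeffs.items())

def fit_least_squares(X, y):
    """Ordinary least squares with intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]  # [intercept, coefficient per concept type]

def r_squared(X, y, beta):
    """Coefficient of determination of the fitted model on (X, y)."""
    pred = [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in X]
    mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

# Toy data (invented): one row per query-document pair,
# columns are (terms similarity, links similarity); y is judged relevance.
X = [(0.9, 0.8), (0.7, 0.1), (0.2, 0.6), (0.1, 0.0), (0.5, 0.5), (0.3, 0.1)]
y = [1, 1, 1, 0, 1, 0]
beta = fit_least_squares(X, y)
print("coefficients:", beta)
print("R^2:", r_squared(X, y, beta))
```

Under these assumptions, the fitted coefficients could then be passed to `combined_similarity` as per-type weights when ranking documents; the abstract additionally mentions sampling and data transformation before fitting, which this sketch omits.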