Performance of self-taught documents: exploiting co-relevance structure in a document collection

Authors: Abraham Bookstein
Affiliations: Graduate Library School, University of Chicago
Venue: Proceedings of the 9th annual international ACM SIGIR conference on Research and development in information retrieval
Year: 1986

Abstract

In this paper we study the behavior of an information retrieval system in which index terms are assigned at random to both documents and requests. The random indexing is then modified by a feedback mechanism, derived from a normal probability model, that is applied to both the request and document representations. Of interest are the convergence properties of the representation vectors. After a few feedback iterations, well-defined clusters form that accurately represent the co-relevance structure among the documents; in effect, the feedback mechanism has permitted the documents to index themselves. This approach offers an interesting way to extend the dimensionality of the indexing vocabulary. Both this application and a theoretical analysis of the impact of extending the indexing vocabulary are discussed.
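
The self-indexing idea can be illustrated with a toy simulation. The abstract does not state the paper's feedback rule, so the sketch below substitutes a simple attraction update as a stand-in: each request pulls its relevant documents toward itself and is pulled toward their centroid. The function names (`simulate`, `cluster_spread`), the hidden group labels, the Gaussian initial vectors, and all parameter values are assumptions made for illustration, not details from the paper.

```python
import itertools
import math
import random


def simulate(n_groups=3, docs_per_group=5, reqs_per_group=4,
             dim=8, iterations=20, rate=0.3, seed=0):
    """Toy model: random initial indexing refined by relevance feedback.

    The group label stands for the hidden co-relevance structure: a
    request is relevant to exactly the documents sharing its label.
    The update rule is an assumed stand-in, not the paper's rule.
    """
    rng = random.Random(seed)

    def rand_vec():
        # Random initial "indexing": one Gaussian weight per dimension.
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    docs = [(g, rand_vec()) for g in range(n_groups)
            for _ in range(docs_per_group)]
    reqs = [(g, rand_vec()) for g in range(n_groups)
            for _ in range(reqs_per_group)]

    for _ in range(iterations):
        for req_group, req_vec in reqs:
            relevant = [vec for grp, vec in docs if grp == req_group]
            # Centroid of the relevant documents before they are moved.
            centroid = [sum(col) / len(relevant) for col in zip(*relevant)]
            # Feedback on documents: move each relevant one toward the request.
            for vec in relevant:
                for i in range(dim):
                    vec[i] += rate * (req_vec[i] - vec[i])
            # Feedback on the request: move it toward that centroid.
            for i in range(dim):
                req_vec[i] += rate * (centroid[i] - req_vec[i])
    return docs


def cluster_spread(docs):
    """Return (mean within-group distance, mean between-group distance)."""
    within, between = [], []
    for (g1, v1), (g2, v2) in itertools.combinations(docs, 2):
        (within if g1 == g2 else between).append(math.dist(v1, v2))
    return sum(within) / len(within), sum(between) / len(between)


if __name__ == "__main__":
    within, between = cluster_spread(simulate())
    print(f"within-group: {within:.3g}, between-group: {between:.3g}")
```

Under these assumptions, documents that share relevant requests collapse onto a common vector while unrelated groups stay apart, which is the qualitative clustering behavior the abstract describes. The sketch says nothing about the convergence rate or the probabilistic analysis in the paper itself.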