A clustered search algorithm incorporating arbitrary term dependencies

Authors:
K. Lam; C. T. Yu
Affiliations:
Hong Kong Univ., Hong Kong; Univ. of Illinois at Chicago, Chicago
Venue:
ACM Transactions on Database Systems (TODS)
Year:
1982

Citing 5

On the estimation of the number of desired records with respect to a given query
ACM Transactions on Database Systems (TODS)

The Association Factor in Information Retrieval
Journal of the ACM (JACM)

On the Construction of Feedback Queries
Journal of the ACM (JACM)

The Design and Analysis of Computer Algorithms

The SMART Retrieval System—Experiments in Automatic Document Processing

Cited 6

Optimum probability estimation based on expectations
SIGIR '88 Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval

A framework for effective retrieval
ACM Transactions on Database Systems (TODS)

Adaptive document clustering
SIGIR '85 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval

Adaptive information system design: one query at a time
SIGMOD '85 Proceedings of the 1985 ACM SIGMOD international conference on Management of data

A Statistical Method for Estimating the Usefulness of Text Databases
IEEE Transactions on Knowledge and Data Engineering

Determining Text Databases to Search in the Internet
VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases

Abstract

The documents in a database are organized into clusters, where each cluster contains similar documents and a representative of these documents. A user query is compared with all the representatives of the clusters, and on the basis of such comparisons, those clusters having many close neighbors with respect to the query are selected for searching. This paper presents an estimation of the number of close neighbors in a cluster in relation to the given query. The estimation takes into consideration the dependencies between terms. It is demonstrated by experiments that the estimate is accurate and the time to generate the estimate is small.
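
The cluster-selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator: the abstract does not give its formulas, and the code below deliberately assumes that query terms occur independently within a cluster, which is the simplification the paper goes beyond by modeling term dependencies. The `Cluster` representative (cluster size plus per-term document counts), the `min_matches` definition of a "close neighbor" (a document sharing at least that many query terms), and all function names are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical cluster representative (an assumption, not from the paper)."""
    name: str
    size: int        # number of documents in the cluster, assumed > 0
    doc_freq: dict   # term -> number of documents in the cluster containing it


def prob_at_least(probs, k):
    """P(at least k successes) among independent Bernoulli trials."""
    dist = [1.0]  # dist[j] = probability of exactly j matches so far
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1.0 - p)   # this term is absent from the document
            nxt[j + 1] += mass * p       # this term is present in the document
        dist = nxt
    return sum(dist[k:])


def estimate_close_neighbors(cluster, query_terms, min_matches):
    """Expected number of documents sharing >= min_matches distinct query terms.

    Assumes term independence, unlike the estimate described in the paper.
    """
    probs = [cluster.doc_freq.get(t, 0) / cluster.size for t in query_terms]
    return cluster.size * prob_at_least(probs, min_matches)


def select_clusters(clusters, query_terms, min_matches, top_n):
    """Rank clusters by estimated close neighbors and keep the top_n."""
    scored = [
        (estimate_close_neighbors(c, query_terms, min_matches), c.name)
        for c in clusters
    ]
    scored.sort(reverse=True)
    return scored[:top_n]
```

Under these assumptions, only the representatives are consulted at query time, so the cost of ranking clusters depends on the number of clusters and query terms rather than on the number of documents. The query terms are assumed to be distinct.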