ACM Transactions on Database Systems (TODS)
This work introduces a new approach to vector-model clustering: a hybrid algorithm that clusters records based on a prescribed threshold value while taking into account the query patterns in a given database. The Hamming distance of a file is used as a 'cheap' measure of space density. The objective of the algorithm is to minimize the response time of a retrieval system by partly maximizing the space density of the file and ensuring that popular tuples remain in physical proximity in the file space. Simulation experiments showed a substantial reduction in response time after a file is restructured. Factors that affect clustering and response time, such as block size, threshold value, and the percentage of records satisfying a given set of queries, are studied using statistical analysis.
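
As a rough illustration of the ideas named in the abstract, the following Python sketch treats each record as a binary keyterm vector and reorders a file with a simple threshold rule. It is not the paper's algorithm. The abstract does not define the Hamming distance of a file, so the sketch assumes it is the sum of Hamming distances between adjacent records, and that a smaller total corresponds to a denser file. The greedy "leader" clustering rule and all function names (`hamming`, `file_hamming_distance`, `threshold_cluster`, `restructure`) are hypothetical, and query patterns are not modeled.

```python
from typing import List, Sequence

# A record is assumed to be a binary keyterm vector, e.g. (1, 0, 1, 1).
Record = Sequence[int]


def hamming(a: Record, b: Record) -> int:
    """Number of positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same length")
    return sum(x != y for x, y in zip(a, b))


def file_hamming_distance(records: List[Record]) -> int:
    """Assumed definition: sum of Hamming distances between adjacent records."""
    return sum(hamming(r, s) for r, s in zip(records, records[1:]))


def threshold_cluster(records: List[Record], threshold: int) -> List[List[Record]]:
    """Greedy leader clustering (an assumption, not the paper's method).

    A record joins the first cluster whose leader (first member) lies
    within `threshold`; otherwise it starts a new cluster.
    """
    clusters: List[List[Record]] = []
    for rec in records:
        for cluster in clusters:
            if hamming(cluster[0], rec) <= threshold:
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


def restructure(records: List[Record], threshold: int) -> List[Record]:
    """Rewrite the file so that members of each cluster are stored contiguously."""
    return [rec for cluster in threshold_cluster(records, threshold) for rec in cluster]


if __name__ == "__main__":
    records = [(1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 0, 1), (0, 0, 1, 0)]
    print(file_hamming_distance(records))
    print(file_hamming_distance(restructure(records, threshold=1)))
```

Under these assumptions, placing similar records next to each other lowers the file's total adjacent-record distance, which is the intuition behind keeping related tuples in physical proximity. A fuller treatment would also weight records by how often the query workload retrieves them.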