Algorithms for within-cluster searches using inverted files

Authors:
Ismail Sengor Altingovde;Fazli Can;Özgür Ulusoy
Affiliations:
Department of Computer Engineering, Bilkent University, Ankara, Turkey;Department of Computer Engineering, Bilkent University, Ankara, Turkey;Department of Computer Engineering, Bilkent University, Ankara, Turkey
Venue:
ISCIS'06 Proceedings of the 21st international conference on Computer and Information Sciences
Year:
2006


Abstract

Information retrieval over clustered document collections has two successive stages: first identifying the best-clusters, and then identifying the best-documents in these clusters that are most similar to the user query. In this paper, we assume that an inverted file over the entire document collection is used for the latter stage. We propose and evaluate algorithms for within-cluster searches, i.e., algorithms that integrate the best-clusters with the best-documents so that the final output contains the highest-ranked documents drawn only from the best-clusters. Our experiments on a TREC collection of 210,158 documents with several query sets show that an integration algorithm selected appropriately for the query length and system resources can significantly improve query evaluation efficiency.
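To make the integration step concrete, the following is a minimal sketch of one naive way to restrict an inverted-file search to the best-clusters: traverse the full posting lists and discard postings whose document lies outside the selected clusters. It is not taken from the paper and should not be read as any of the algorithms the paper proposes. The function names, the term-frequency scoring, the toy documents, and the `doc_cluster` mapping are all assumptions made for illustration.

```python
from collections import defaultdict
import heapq


def build_inverted_file(docs):
    """Map each term to a list of (doc_id, term_frequency) postings."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        counts = defaultdict(int)
        for term in text.lower().split():
            counts[term] += 1
        for term, tf in counts.items():
            index[term].append((doc_id, tf))
    return index


def within_cluster_search(query, index, doc_cluster, best_clusters, k):
    """Rank documents by summed term frequency, keeping only best-cluster members.

    The scoring function is a placeholder (an assumption); any similarity
    measure computed from postings could be substituted.
    """
    scores = defaultdict(int)
    for term in set(query.lower().split()):
        for doc_id, tf in index.get(term, []):
            # Skip postings whose document lies outside the best-clusters.
            if doc_cluster[doc_id] in best_clusters:
                scores[doc_id] += tf
    # Highest score first; ties are broken by doc_id for determinism.
    return heapq.nsmallest(k, scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy collection: four documents assigned to two clusters.
docs = {
    "d1": "inverted file search",
    "d2": "cluster search search search search search",
    "d3": "inverted file inverted file",
    "d4": "cluster based retrieval",
}
doc_cluster = {"d1": 0, "d2": 1, "d3": 0, "d4": 1}

index = build_inverted_file(docs)
# Assume the first stage selected cluster 0 as the only best-cluster.
results = within_cluster_search("inverted file search", index, doc_cluster, {0}, k=2)
```

In this toy run, document `d2` would score highest in an unrestricted search, but it is excluded because it belongs to cluster 1. The sketch still reads every posting of every query term, including those for documents outside the best-clusters, which is the kind of cost that the choice of integration algorithm could plausibly affect.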