Large-scale cluster-based retrieval experiments on Turkish texts

Authors:
Ismail Sengor Altingovde; Rifat Ozcan; Huseyin Cagdas Ocalan; Fazli Can; Özgür Ulusoy
Affiliations:
Bilkent University, Ankara, Turkey (all authors)
Venue:
SIGIR '07: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Year:
2007



Abstract

We present cluster-based retrieval (CBR) experiments on the largest available Turkish document collection. The experiments evaluate retrieval effectiveness and efficiency on both an automatically generated clustering structure and a manual classification of documents. In particular, we compare the effectiveness of CBR with that of full-text search (FS) and evaluate several implementation alternatives for CBR. Our findings show that CBR yields effectiveness comparable to that of FS. Furthermore, by using a specifically tailored cluster-skipping inverted index, we significantly improve the in-memory query processing efficiency of CBR compared with traditional CBR techniques and even with FS.
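The abstract names a cluster-skipping inverted index but does not describe its layout. The Python sketch below illustrates one plausible reading of the idea, under stated assumptions: each term's postings are grouped into per-cluster blocks, so that query processing can skip whole blocks belonging to clusters that were not selected for the query. The function names (build_index, search), the toy documents, and the summed term-frequency scoring are all hypothetical simplifications, not the authors' implementation; the step that selects the best clusters is assumed to have happened already.

```python
from collections import defaultdict


def build_index(docs, doc_cluster):
    """Build a toy inverted index whose postings are grouped by cluster.

    docs:        {doc_id: [term, ...]}
    doc_cluster: {doc_id: cluster_id}
    Returns {term: [(cluster_id, [(doc_id, tf), ...]), ...]} with blocks
    ordered by cluster id and postings ordered by doc id.
    """
    grouped = defaultdict(lambda: defaultdict(dict))
    for doc_id, terms in docs.items():
        cid = doc_cluster[doc_id]
        for term in terms:
            block = grouped[term][cid]
            block[doc_id] = block.get(doc_id, 0) + 1  # term frequency
    return {
        term: [(cid, sorted(blocks[cid].items())) for cid in sorted(blocks)]
        for term, blocks in grouped.items()
    }


def search(index, query_terms, best_clusters, top_k=10):
    """Score only documents in the selected clusters.

    Blocks of non-selected clusters are skipped without touching their
    postings. Scoring is a plain sum of term frequencies (an assumption
    made to keep the sketch short).
    """
    scores = defaultdict(int)
    for term in query_terms:
        for cid, postings in index.get(term, []):
            if cid not in best_clusters:
                continue  # skip the whole block for this cluster
            for doc_id, tf in postings:
                scores[doc_id] += tf
    # Highest score first; ties broken by ascending doc id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical toy collection: four documents in two clusters.
docs = {1: ["kitap", "okul"], 2: ["kitap", "kitap"], 3: ["okul"], 4: ["kitap"]}
doc_cluster = {1: 0, 2: 0, 3: 1, 4: 1}
index = build_index(docs, doc_cluster)
```

With this toy data, search(index, ["kitap"], {0}) only reads the block for cluster 0 and never visits document 4. In a real compressed index the skip would be a stored pointer to the next block rather than a loop that discards a Python list, which is where the efficiency gain would come from.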