Large-scale cluster-based retrieval experiments on Turkish texts

  • Authors: Ismail Sengor Altingovde; Rifat Ozcan; Huseyin Cagdas Ocalan; Fazli Can; Özgür Ulusoy

  • Affiliations: Bilkent University, Ankara, Turkey (all authors)

  • Venue: SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year: 2007

Quantified Score

Hi-index 0.02

Abstract

We present cluster-based retrieval (CBR) experiments on the largest available Turkish document collection. Our experiments evaluate retrieval effectiveness and efficiency on both an automatically generated clustering structure and a manual classification of documents. In particular, we compare the effectiveness of CBR with that of full-text search (FS) and evaluate several implementation alternatives for CBR. Our findings reveal that CBR yields effectiveness figures comparable to those of FS. Furthermore, by using a specifically tailored cluster-skipping inverted index, we significantly improve the in-memory query processing efficiency of CBR in comparison to other traditional CBR techniques and even FS.
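
The abstract does not describe the index layout, so the following is only a minimal sketch of the general idea behind a cluster-skipping inverted index, not the authors' implementation. It assumes that each term's posting list is grouped by cluster, so that blocks belonging to non-selected clusters can be skipped as a whole. The function names (build_cluster_skipping_index, search), the toy documents, and the term-frequency scoring are all hypothetical, and the selection of the best clusters is assumed to have happened already.

```python
from collections import defaultdict


def build_cluster_skipping_index(docs, doc_cluster):
    """Build a toy index: term -> list of (cluster_id, postings) blocks.

    docs:        {doc_id: [term, ...]}
    doc_cluster: {doc_id: cluster_id}
    Each postings block is a list of (doc_id, term_frequency) pairs, and
    blocks are ordered by cluster id so a whole cluster can be skipped.
    """
    grouped = defaultdict(lambda: defaultdict(dict))
    for doc_id, terms in docs.items():
        cluster_id = doc_cluster[doc_id]
        for term in terms:
            block = grouped[term][cluster_id]
            block[doc_id] = block.get(doc_id, 0) + 1
    return {
        term: [(c, sorted(clusters[c].items())) for c in sorted(clusters)]
        for term, clusters in grouped.items()
    }


def search(index, query_terms, best_clusters):
    """Score documents by summed term frequency (an illustrative choice),
    visiting only posting blocks whose cluster is in best_clusters."""
    scores = defaultdict(int)
    for term in query_terms:
        for cluster_id, postings in index.get(term, []):
            if cluster_id not in best_clusters:
                continue  # skip the entire block of this cluster
            for doc_id, tf in postings:
                scores[doc_id] += tf
    # Highest score first; ties broken by ascending doc id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy collection: four documents in two clusters.
docs = {
    1: ["ankara", "kent"],
    2: ["ankara", "ankara", "deniz"],
    3: ["deniz", "kent"],
    4: ["ankara"],
}
doc_cluster = {1: 0, 2: 0, 3: 1, 4: 1}
index = build_cluster_skipping_index(docs, doc_cluster)
results = search(index, ["ankara", "deniz"], best_clusters={0})
```

In this sketch the postings for documents 3 and 4 are never scored because their cluster was not selected. A real implementation would store skip pointers inside a contiguous posting list rather than nested Python lists, which is where the efficiency gain reported in the abstract would come from.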