Users express their queries in various ways: sometimes with more general terms, sometimes with more specific ones. Information retrieval systems need to accommodate this variety. Some retrieval models perform better when queries are general, others when queries are specific, and others when a query combines both kinds of terms. In this paper we seek a system that performs well in all these cases. We present a new method for combining the results of different models in order to improve performance on a difficult task: information retrieval from spontaneous speech. Our technique clusters the training topics according to their tf-idf (term frequency-inverse document frequency) properties and selects the best models for each cluster. When the system runs on a test topic, the topic's cluster is determined first, and the combination of models selected for that cluster is then used. We report improvements on the Malach collection used at CLEF-CLSR 2007.
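The cluster-then-select idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say which tf-idf properties, clustering algorithm, or evaluation measure are used. The sketch assumes two features per topic (mean term frequency and mean inverse document frequency), a plain k-means with a fixed initialization, and a single best model per cluster rather than a combination of models. All names, document frequencies, and scores (`model_A`, `model_B`, `doc_freq`, and so on) are made-up toy values.

```python
import math

def topic_features(terms, doc_freq, n_docs):
    """Summarize a non-empty topic by (mean tf, mean idf); an assumed feature choice."""
    counts = {}
    for t in terms:
        counts[t] = counts.get(t, 0) + 1
    mean_tf = sum(counts.values()) / len(counts)
    mean_idf = sum(math.log(n_docs / (1 + doc_freq.get(t, 0))) for t in counts) / len(counts)
    return (mean_tf, mean_idf)

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])))

def kmeans(points, k, iters=20):
    """Plain k-means; the first k points serve as initial centroids (deterministic)."""
    centroids = list(points[:k])
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [tuple(sum(d) / len(g) for d in zip(*g)) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids

def best_model_per_cluster(points, scores, centroids):
    """For each cluster, pick the model with the highest mean training score."""
    per_cluster = [dict() for _ in centroids]
    for p, topic_scores in zip(points, scores):
        c = nearest(p, centroids)
        for model, s in topic_scores.items():
            per_cluster[c].setdefault(model, []).append(s)
    return [max(d, key=lambda m: sum(d[m]) / len(d[m])) if d else None
            for d in per_cluster]

# Toy data (hypothetical): frequent terms stand for general queries, rare ones for specific.
n_docs = 1000
doc_freq = {"people": 500, "time": 400, "work": 450, "place": 300,
            "zeppelin": 5, "quartz": 8, "obelisk": 2, "lichen": 20}
train_topics = [["people", "time"], ["zeppelin", "quartz"],
                ["work", "place"], ["obelisk", "lichen"]]
# Per-topic effectiveness of two hypothetical retrieval models on the training topics.
train_scores = [{"model_A": 0.30, "model_B": 0.20}, {"model_A": 0.10, "model_B": 0.25},
                {"model_A": 0.28, "model_B": 0.22}, {"model_A": 0.12, "model_B": 0.30}]

points = [topic_features(t, doc_freq, n_docs) for t in train_topics]
centroids = kmeans(points, k=2)
best = best_model_per_cluster(points, train_scores, centroids)

def select_model(test_terms):
    """Assign a test topic to its nearest cluster and return that cluster's model."""
    return best[nearest(topic_features(test_terms, doc_freq, n_docs), centroids)]

print(select_model(["time", "work"]))        # general-looking topic
print(select_model(["lichen", "zeppelin"]))  # specific-looking topic
```

In this toy setup the topic made of frequent terms is routed to the model that scored best on the general training topics, and the topic made of rare terms to the model that scored best on the specific ones. Extending `best_model_per_cluster` to return several top-scoring models per cluster would be closer to the combination of models described above.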