A hidden Markov model information retrieval system
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Document language models, query models, and risk minimization for information retrieval
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Relevance based language models
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Model-based feedback in the language modeling approach to information retrieval
Proceedings of the tenth international conference on Information and knowledge management
A study of smoothing methods for language models applied to information retrieval
ACM Transactions on Information Systems (TOIS)
Dependence language model for information retrieval
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Cluster-based retrieval using language models
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Integrating word relationships into language models
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Language model information retrieval with document expansion
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
An analysis on document length retrieval trends in language modeling smoothing
Information Retrieval
A risk minimization framework for information retrieval
Information Processing and Management: an International Journal - Special issue: Formal methods for information retrieval
Hi-index | 0.00 |
Language models form a class of successful probabilistic models in information retrieval. However, knowledge of why some methods perform better than others in a particular situation remains limited. In this study we analyze what language model factors influence information retrieval performance. Starting from popular smoothing methods we review what data features have been used. Document length and a measure of document word distribution turned out to be the important factors, in addition to a distinction in estimating the probability of seen and unseen words. We propose a class of parameter-free smoothing methods, of which multiple specific instances are possible. Instead of parameter tuning however, an analysis of data features should be used to decide upon a specific method. Finally, we discuss some initial experiments.