Parsimonious language models for information retrieval

  • Authors:
  • Djoerd Hiemstra;Stephen Robertson;Hugo Zaragoza

  • Affiliations:
  • University of Twente, Enschede, The Netherlands;Microsoft Research, Cambridge, U.K.;Microsoft Research, Cambridge, U.K.

  • Venue:
  • Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2004

Abstract

We systematically investigate a new approach to estimating the parameters of language models for information retrieval, called parsimonious language models. Parsimonious language models explicitly address the relation between levels of language models that are typically used for smoothing. As such, they need fewer (non-zero) parameters to describe the data. We apply parsimonious models at three stages of the retrieval process: 1) at indexing time; 2) at search time; 3) at feedback time. Experimental results show that we are able to build models that are significantly smaller than standard models, but that still perform at least as well as the standard approaches.
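The core of the approach is an EM-style re-estimation that mixes the document model with a general background (collection) model and then prunes terms whose probability mass is effectively explained by the background. The sketch below illustrates this idea only in outline; the mixing weight `lam`, the pruning `threshold`, the iteration count, and all function and variable names are assumptions for illustration, not the paper's reference implementation.

```python
from collections import Counter

def parsimonious_model(doc_terms, background, lam=0.9, n_iter=50, threshold=1e-4):
    """Sketch of parsimonious document-model estimation via EM.

    doc_terms:  list of tokens in the document
    background: dict mapping term -> P(t|C), the collection language model
    lam:        weight of the document model in the mixture (assumed value)
    Returns a sparse dict term -> P(t|D) with near-zero entries pruned.
    """
    tf = Counter(doc_terms)
    doc_len = sum(tf.values())
    # initialise with the maximum-likelihood document model
    p_doc = {t: c / doc_len for t, c in tf.items()}

    for _ in range(n_iter):
        # E-step: expected term counts attributed to the document model,
        # the rest being explained by the background model
        e = {}
        for t, count in tf.items():
            p_d = p_doc.get(t, 0.0)
            p_c = background.get(t, 1e-12)
            e[t] = count * (lam * p_d) / (lam * p_d + (1 - lam) * p_c)

        # M-step: renormalise and drop terms whose probability falls
        # below the threshold, yielding a parsimonious (sparse) model
        total = sum(e.values()) or 1.0
        p_doc = {t: v / total for t, v in e.items() if v / total > threshold}

    return p_doc
```

In this sketch the same routine could in principle be applied at indexing time (per document), at search time (per query), or at feedback time (over a set of feedback documents), which is how the abstract describes the three stages of use.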