A general language model for information retrieval

Authors:
Fei Song; W. Bruce Croft
Affiliations:
Dept. of Computing and Info. Science, University of Guelph, Guelph, Ontario, Canada N1G 2W1; Dept. of Computer Science, University of Massachusetts, Amherst, Massachusetts
Venue:
Proceedings of the eighth international conference on Information and knowledge management
Year:
1999

Abstract

Statistical language modeling has been used successfully for speech recognition, part-of-speech tagging, and syntactic parsing, and has recently been applied to information retrieval. In this new paradigm, each document is viewed as a language sample, and a query as a generation process. Retrieved documents are ranked by the probability that their corresponding language models produce the query. In this paper, we present a new language model for information retrieval, based on a range of data smoothing techniques, including the Good-Turing estimate, curve-fitting functions, and model combinations. Our model is conceptually simple and intuitive, and can be easily extended to incorporate probabilities of phrases such as word pairs and word triples. Experiments with the Wall Street Journal and TREC4 data sets show that the performance of our model is comparable to that of INQUERY and better than that of another language model for information retrieval. In particular, word pairs are shown to improve retrieval performance.
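
The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not give their formulas, so the Good-Turing estimate and curve-fitting steps are left out, and a simple linear interpolation of a document model with a corpus model stands in for "model combination". The function names, the weight `lam`, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def unigram_model(tokens):
    """Maximum-likelihood unigram probabilities for a list of tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()}


def query_log_likelihood(query, doc_tokens, corpus_model, lam=0.5):
    """Log P(query | document) under an interpolated unigram model.

    lam is a hypothetical mixing weight between the document model and
    the corpus model. The paper's own smoothing (Good-Turing estimate,
    curve fitting) is NOT reproduced here.
    """
    doc_model = unigram_model(doc_tokens)
    score = 0.0
    for term in query:
        p = lam * doc_model.get(term, 0.0) + (1 - lam) * corpus_model.get(term, 0.0)
        if p == 0.0:
            return float("-inf")  # term never occurs anywhere in the corpus
        score += math.log(p)
    return score


def rank(query, docs, lam=0.5):
    """Return document ids sorted by descending query log-likelihood."""
    corpus_tokens = [t for tokens in docs.values() for t in tokens]
    corpus_model = unigram_model(corpus_tokens)
    scored = {
        doc_id: query_log_likelihood(query, tokens, corpus_model, lam)
        for doc_id, tokens in docs.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Toy collection (invented for this example).
docs = {
    "d1": "stock prices fell on wall street".split(),
    "d2": "the language model ranks each document".split(),
    "d3": "language model smoothing for retrieval".split(),
}
ranking = rank(["language", "model", "smoothing"], docs)
print(ranking)
```

Because the corpus model gives every query term a nonzero probability, a document that lacks a query term (here `d1`) still receives a finite score instead of being eliminated, which is the practical reason smoothing matters in this paradigm. Extending the sketch to word pairs would mean interpolating a bigram estimate in the same way, which is again an assumption rather than the paper's stated formulation.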