This paper presents a new probabilistic model of information retrieval. The most important modeling assumption made is that documents and queries are defined by an ordered sequence of single terms. This assumption is not made in well-known existing models of information retrieval, but is essential in the field of statistical natural language processing. Advances already made in statistical natural language processing are used in this paper to formulate a probabilistic justification for using tf×idf term weighting. The paper shows that the new probabilistic interpretation of tf×idf term weighting might lead to a better understanding of statistical ranking mechanisms, for example by explaining how they relate to coordination level ranking. A pilot experiment on the Cranfield test collection indicates that the presented model outperforms the vector space model with classical tf×idf and cosine length normalisation.
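To make the two reference points named above concrete, the following is a minimal sketch of the baseline only: a vector space ranking with tf×idf weights and cosine length normalisation, alongside coordination level ranking (the number of distinct query terms a document contains). It is not the probabilistic model the paper proposes. The specific weighting variant (raw term frequency times log(N/df)) and the function names `tfidf_vectors`, `cosine_score`, and `coordination_level` are assumptions chosen for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vectors, idf) for a list of tokenised documents.

    Assumed variant: weight = raw tf * log(N / df). Other tf*idf
    variants exist; this one is chosen only for illustration.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in docs
    ]
    return vectors, idf


def cosine_score(query, doc_vec, idf):
    """Cosine similarity between a tf*idf query vector and a document vector."""
    # Query terms unseen in the collection get weight 0.
    q_vec = {term: tf * idf.get(term, 0.0) for term, tf in Counter(query).items()}
    dot = sum(w * doc_vec.get(term, 0.0) for term, w in q_vec.items())
    q_norm = math.sqrt(sum(w * w for w in q_vec.values()))
    d_norm = math.sqrt(sum(w * w for w in doc_vec.values()))
    if q_norm == 0.0 or d_norm == 0.0:
        return 0.0
    return dot / (q_norm * d_norm)


def coordination_level(query, doc):
    """Number of distinct query terms that occur in the document."""
    return len(set(query) & set(doc))
```

Ranking a collection then amounts to sorting documents by `cosine_score` (or by `coordination_level`) for a given query. Comparing the two orderings on a small collection shows where tf×idf weighting refines the coarser coordination level ranking.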