Term weighting schemes are central to information retrieval systems. This article proposes a novel TF-IDF term weighting scheme that uses two different within-document term frequency normalizations to capture two different aspects of term saliency. One term frequency component is effective for short queries, while the other performs better on long queries. The final weight is a weighted combination of these two components, with the combination weight determined by the length of the corresponding query. Experiments on a large number of TREC news and web collections show that the proposed scheme almost always outperforms five state-of-the-art retrieval models, with improvements that are both significant and consistent. The results also show that the proposed model achieves significantly better precision than the existing models.
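The idea of combining two term frequency normalizations with a weight that depends on query length can be sketched as follows. This is only an illustrative sketch: the abstract does not give the formulas, so the functional forms of `relative_tf`, `length_regularized_tf`, `squash`, `mixing_weight`, and `idf`, along with all function and parameter names, are assumptions chosen for the example rather than the scheme's actual definition.

```python
import math


def relative_tf(tf, avg_tf_in_doc):
    """Assumed form: tf relative to the average term frequency of the same document."""
    return math.log2(1 + tf) / math.log2(1 + avg_tf_in_doc)


def length_regularized_tf(tf, doc_len, avg_doc_len):
    """Assumed form: tf damped for documents longer than the collection average."""
    return tf * math.log2(1 + avg_doc_len / doc_len)


def squash(x):
    """Map a non-negative value into [0, 1) so both components share one scale."""
    return x / (1 + x)


def mixing_weight(query_len):
    """Assumed form: equals 1.0 for a one-term query and shrinks as the query grows."""
    return 2 / (1 + math.log2(1 + query_len))


def combined_tf(tf, avg_tf_in_doc, doc_len, avg_doc_len, query_len):
    """Query-length-dependent combination of the two normalized components."""
    w = mixing_weight(query_len)
    short_query_part = squash(relative_tf(tf, avg_tf_in_doc))
    long_query_part = squash(length_regularized_tf(tf, doc_len, avg_doc_len))
    return w * short_query_part + (1 - w) * long_query_part


def idf(num_docs, doc_freq):
    """A common IDF variant, used here only as a placeholder (assumption)."""
    return math.log((num_docs + 1) / doc_freq)


def term_weight(tf, avg_tf_in_doc, doc_len, avg_doc_len, query_len, num_docs, doc_freq):
    """TF-IDF style weight: combined term frequency factor times IDF."""
    return combined_tf(tf, avg_tf_in_doc, doc_len, avg_doc_len, query_len) * idf(num_docs, doc_freq)


if __name__ == "__main__":
    # Hypothetical statistics for one term in one document.
    for query_len in (1, 3, 7):
        weight = term_weight(tf=4, avg_tf_in_doc=2.0, doc_len=300, avg_doc_len=250,
                             query_len=query_len, num_docs=100_000, doc_freq=1_200)
        print(query_len, round(mixing_weight(query_len), 3), round(weight, 3))
```

With these assumed forms, a one-term query relies entirely on the first component, and longer queries shift weight toward the second, which mirrors the behavior described above without claiming to reproduce it.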