Simplified similarity scoring using term ranks

Authors:
Vo Ngoc Anh;Alistair Moffat
Affiliations:
The University of Melbourne, Victoria, Australia;The University of Melbourne, Victoria, Australia
Venue:
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2005

Citing (works this paper references): 11
Cited (works that reference this paper): 27

Pivoted document length normalization

SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Exploring the similarity space

ACM SIGIR Forum
A language modeling approach to information retrieval

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Managing gigabytes (2nd ed.): compressing and indexing documents and images

Managing gigabytes (2nd ed.): compressing and indexing documents and images
Evaluating evaluation measure stability

SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Vector-space ranking with effective early termination

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Impact transformation: effective and efficient web retrieval

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Title language model for information retrieval

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Document normalization revisited

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Challenges in information retrieval and language modeling: report of a workshop held at the center for intelligent information retrieval, University of Massachusetts Amherst, September 2002

ACM SIGIR Forum
Inverted Index Compression Using Word-Aligned Binary Codes

Information Retrieval

Pruned query evaluation using pre-computed impacts

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Pruning strategies for mixed-mode querying

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Efficient document retrieval in main memory

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Enhancing relevance scoring with chronological term rank

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Effective top-k computation in retrieving structured documents with term-proximity support

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Incremental cluster-based retrieval using compressed cluster-skipping inverted files

ACM Transactions on Information Systems (TOIS)
Site-based dynamic pruning for query processing in search engines

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Can phrase indexing help to process non-phrase queries?

Proceedings of the 17th ACM conference on Information and knowledge management
Term Impacts as Normalized Term Frequencies for BM25 Similarity Scoring

SPIRE '08 Proceedings of the 15th International Symposium on String Processing and Information Retrieval
Semi-parametric and Non-parametric Term Weighting for Information Retrieval

ICTIR '09 Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory
Inverted indexes vs. bitmap indexes in decision support systems

Proceedings of the 18th ACM conference on Information and knowledge management
Sorting out the document identifier assignment problem

ECIR'07 Proceedings of the 29th European conference on IR research
Improving MEDLINE document retrieval using automatic query expansion

ICADL'07 Proceedings of the 10th international conference on Asian digital libraries: looking back 10 years and forging new frontiers
A statistical view of binned retrieval models

ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval
On the query reformulation technique for effective MEDLINE document retrieval

Journal of Biomedical Informatics
Reverted indexing for feedback and expansion

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Term frequency quantization for compressing an inverted index

AMT'10 Proceedings of the 6th international conference on Active media technology
Faster top-k document retrieval using block-max indexes

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Efficient query evaluation through access-reordering

AIRS'06 Proceedings of the Third Asia conference on Information Retrieval Technology
Graph-based term weighting for information retrieval

Information Retrieval
Adaptive term weighting through stochastic optimization

CICLing'10 Proceedings of the 11th international conference on Computational Linguistics and Intelligent Text Processing
Index ordering by query-independent measures

Information Processing and Management: an International Journal
Source selection for image retrieval in peer-to-peer networks

FDIA'09 Proceedings of the Third BCS-IRSG conference on Future Directions in Information Access
LePrEF: Learn to precompute evidence fusion for efficient query evaluation

Journal of the American Society for Information Science and Technology
An incremental approach to efficient pseudo-relevance feedback

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
A candidate filtering mechanism for fast top-k query processing on modern cpus

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Fast candidate generation for real-time tweet search with bloom filter chains

ACM Transactions on Information Systems (TOIS)

Abstract

We propose a method for document ranking that combines a simple document-centric view of text with the fast evaluation strategies developed for the vector space model. The new method defines the importance of a term within a document qualitatively rather than quantitatively, and in doing so reduces the need for tuning parameters. The method also supports very fast query processing: most of the computation is carried out on small integers, and dynamic pruning is an effective option. Experiments on a wide range of TREC data show that the new method provides retrieval effectiveness as good as or better than both the Okapi BM25 formulation and variants of language models.
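
The idea of scoring with term ranks and small integers can be illustrated with a minimal Python sketch. It is not the authors' formulation: the linear bucketing of ranks into `levels` impact values, the alphabetical tie-breaking, and the function names `term_impacts`, `build_index`, and `score` are all assumptions made for illustration. Each document's distinct terms are ordered by decreasing within-document frequency, the rank is mapped to a small integer impact, and a query is scored by summing integer impacts. Dynamic pruning is not shown.

```python
from collections import Counter, defaultdict


def term_impacts(tokens, levels=8):
    """Map each distinct term of one document to an integer impact in 1..levels.

    Terms are ranked by decreasing frequency within the document; the rank,
    not the raw count, determines the impact (illustrative assumption).
    """
    counts = Counter(tokens)
    # Rank terms by decreasing frequency; break ties alphabetically.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    n = len(ranked)
    impacts = {}
    for rank, term in enumerate(ranked):
        # Linear bucketing (assumed): the top-ranked term gets `levels`,
        # the lowest-ranked term gets at least 1.
        impacts[term] = levels - (rank * levels) // n
    return impacts


def build_index(docs, levels=8):
    """Build an inverted index mapping term -> list of (doc_id, impact)."""
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for term, impact in term_impacts(tokens, levels).items():
            index[term].append((doc_id, impact))
    return index


def score(index, query_terms):
    """Sum integer impacts per document; return (doc_id, score) pairs, best first."""
    acc = defaultdict(int)
    for term in set(query_terms):
        for doc_id, impact in index.get(term, []):
            acc[doc_id] += impact
    return sorted(acc.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    docs = {
        "d1": "a a a b b c".split(),
        "d2": "c c d".split(),
    }
    index = build_index(docs, levels=3)
    print(score(index, ["c", "a"]))
```

In this sketch the only quantity stored per posting is a small integer, so query evaluation reduces to integer additions. The choice of `levels` and of the rank-to-impact mapping is where a real system would differ.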