Efficient passage ranking for document databases

Authors:
Marcin Kaszkiel; Justin Zobel; Ron Sacks-Davis
Affiliations:
RMIT Univ., Melbourne, Australia; RMIT Univ., Melbourne, Australia; RMIT Univ., Melbourne, Australia
Venue:
ACM Transactions on Information Systems (TOIS)
Year:
1999

Abstract

Queries to text collections are resolved by ranking the documents in the collection and returning the highest-scoring documents to the user. An alternative retrieval method is to rank passages, that is, short fragments of documents, a strategy that can improve effectiveness and identify relevant material in documents that are too large for users to consider as a whole. However, ranking of passages can considerably increase retrieval costs. In this article we explore alternative query evaluation techniques and develop new techniques for evaluating queries on passages. We show experimentally that, appropriately implemented, effective passage retrieval is practical in limited memory on a desktop machine. Compared to passage ranking with adaptations of current document ranking algorithms, our new “DO-TOS” passage-ranking algorithm requires only a fraction of the resources, at the cost of a small loss of effectiveness.