Query evaluation: strategies and optimizations
Information Processing and Management: an International Journal
Filtered document retrieval with frequency-sorted indexes
Journal of the American Society for Information Science
Self-indexing inverted files for fast text retrieval
ACM Transactions on Information Systems (TOIS)
Combining fuzzy information from multiple systems (extended abstract)
PODS '96 Proceedings of the fifteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Optimization of inverted vector searches
SIGIR '85 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval
The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Optimal aggregation algorithms for middleware
PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Static index pruning for information retrieval systems
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Combining fuzzy information: an overview
ACM SIGMOD Record
On the integration of structure indexes and inverted lists
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Adaptive Processing of Top-k Queries in XML
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Improving Web search efficiency via a locality based static pruning method
WWW '05 Proceedings of the 14th international conference on World Wide Web
Simplified similarity scoring using term ranks
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Inverted files for text search engines
ACM Computing Surveys (CSUR)
Pruned query evaluation using pre-computed impacts
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
A document-centric approach to static index pruning in text retrieval systems
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Pruning strategies for mixed-mode querying
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Optimized query execution in large search engines with global page ordering
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Top-k query evaluation with probabilistic guarantees
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Term proximity scoring for keyword-based retrieval systems
ECIR'03 Proceedings of the 25th European conference on IR research
Can phrase indexing help to process non-phrase queries?
Proceedings of the 17th ACM conference on Information and knowledge management
Effective top-k computation with term-proximity support
Information Processing and Management: an International Journal
Revisiting globally sorted indexes for efficient document retrieval
Proceedings of the third ACM international conference on Web search and data mining
Efficient term proximity search with term-pair indexes
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Upper-bound approximations for dynamic pruning
ACM Transactions on Information Systems (TOIS)
High-performance processing of text queries with tunable pruned term and term pair indexes
ACM Transactions on Information Systems (TOIS)
Optimized top-k processing with global page scores on block-max indexes
Proceedings of the fifth ACM international conference on Web search and data mining
Two-Stage learning to rank for information retrieval
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Hi-index | 0.00 |
Modern web search engines are expected to return top-k results efficiently given a query. Although many dynamic index pruning strategies have been proposed for efficient top-k computation, most of them are prone to ignore some especially important factors in ranking functions, e.g. term proximity (the distance relationship between query terms in a document). The inclusion of term proximity breaks the monotonicity of ranking functions and therefore leads to additional challenges for efficient query processing. This paper studies the performance of some existing top-k computation approaches using term-proximity-enabled ranking functions. Our investigation demonstrates that, when term proximity is incorporated into ranking functions, most existing index structures and top-k strategies become quite inefficient. According to our analysis and experimental results, we propose two index structures and their corresponding index pruning strategies: Structured and Hybrid, which performs much better on the new settings. Moreover, the efficiency of index building and maintenance would not be affected too much with the two approaches.