Query evaluation: strategies and optimizations
Information Processing and Management: an International Journal
Compression of inverted indexes for fast query evaluation
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Local versus global link information in the Web
ACM Transactions on Information Systems (TOIS)
Efficient query evaluation using a two-level retrieval process
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Beyond PageRank: machine learning for static ranking
Proceedings of the 15th international conference on World Wide Web
Efficient document retrieval in main memory
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Optimized query execution in large search engines with global page ordering
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
BrowseRank: letting web users vote for page importance
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Challenges in building large-scale information retrieval systems: invited talk
Proceedings of the Second ACM International Conference on Web Search and Data Mining
Inverted index compression and query processing with optimized document ordering
Proceedings of the 18th international conference on World wide web
Expected reciprocal rank for graded relevance
Proceedings of the 18th ACM conference on Information and knowledge management
Sorting out the document identifier assignment problem
ECIR'07 Proceedings of the 29th European conference on IR research
How good is a span of terms?: exploiting proximity to improve web retrieval
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Quality-biased ranking of web documents
Proceedings of the fourth ACM international conference on Web search and data mining
Learning to rank with multiple objective functions
Proceedings of the 20th international conference on World wide web
Faster top-k document retrieval using block-max indexes
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Efficient and effective spam filtering and re-ranking for large web datasets
Information Retrieval
Efficient phrase querying with flat position index
Proceedings of the 20th ACM international conference on Information and knowledge management
Efficient query evaluation through access-reordering
AIRS'06 Proceedings of the Third Asia conference on Information Retrieval Technology
Efficient in-memory top-k document retrieval
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
The seventeenth Australasian document computing symposium
ACM SIGIR Forum
Following Long and Suel, we empirically investigate the importance of document order in search engines that rank documents using a combination of dynamic (query-dependent) and static (query-independent) scores and that use document-at-a-time (DAAT) processing. When inverted file postings are in collection order, assigning document numbers in order of descending static score supports lossless early termination while maintaining good compression. Since static scores may not be available until all documents have been gathered and indexed, we build a tool for reordering an existing index and show that it operates in less than 20% of the original indexing time. We note that this additional cost is easily recouped by savings at query processing time. We compare the best early-termination points for several different index orders on three enterprise search test sets (a whole-of-government index with two very different query sets, and a collection from a UK university). We also present results for the same orders on ClueWeb09-CatB. Our evaluation focuses on finding results likely to be clicked on by users of Web or website search engines, that is, Nav and Key results in the TREC 2011 Web Track judging scheme. The orderings tested are Original, Reverse, Random, and QIE (descending order of static score). For the three enterprise search test sets we find that QIE order can achieve close-to-maximal search effectiveness at much lower computational cost than the other orderings. Additionally, reordering has negligible impact on compressed index size for indexes that contain position information. Our results for an artificial query set against the TREC ClueWeb09 Category B collection are much more equivocal, and we canvass possible explanations for future investigation.
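The link between descending-static-score numbering and lossless early termination can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes, purely for illustration, that a document's final score is its static score plus the sum of per-term dynamic scores, and that an upper bound on the total dynamic score (`max_dynamic`) is known. The function names, the toy postings, and the scores are all hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter

def reorder_by_static_score(static_scores):
    """Map old doc ids to new ids so that new id 0 has the highest static score."""
    order = sorted(range(len(static_scores)), key=lambda d: -static_scores[d])
    mapping = {old: new for new, old in enumerate(order)}
    new_static = [static_scores[old] for old in order]  # descending by construction
    return mapping, new_static

def remap_postings(postings, mapping):
    """Rewrite each postings list with the new ids, sorted by new id."""
    return {t: sorted((mapping[d], s) for d, s in plist)
            for t, plist in postings.items()}

def daat_top_k(postings, static, query_terms, k, max_dynamic):
    """Disjunctive DAAT scoring with early termination.

    Assumption: score = static + sum of dynamic scores, and no document's
    total dynamic score exceeds max_dynamic. Returns (top_k, docs_scored).
    """
    lists = [postings.get(t, []) for t in query_terms]
    merged = heapq.merge(*lists)   # (doc, score) pairs in doc-id order
    heap = []                      # min-heap of (total_score, doc)
    scored = 0
    for doc, group in groupby(merged, key=itemgetter(0)):
        # Static scores only decrease from here on, so once even the best
        # possible total cannot beat the current k-th score we can stop.
        if len(heap) == k and static[doc] + max_dynamic <= heap[0][0]:
            break
        scored += 1
        total = static[doc] + sum(s for _, s in group)
        if len(heap) < k:
            heapq.heappush(heap, (total, doc))
        elif total > heap[0][0]:
            heapq.heapreplace(heap, (total, doc))
    return sorted(heap, reverse=True), scored

# Toy collection: four documents in collection order, two query terms.
static_scores = [0.1, 0.9, 0.5, 0.3]
postings = {
    "a": [(0, 0.2), (1, 0.3), (2, 0.1)],
    "b": [(1, 0.2), (3, 0.4)],
}
mapping, new_static = reorder_by_static_score(static_scores)
reordered = remap_postings(postings, mapping)

# Bound on dynamic score: sum of the per-term maxima (0.3 + 0.4).
early, scored_early = daat_top_k(reordered, new_static, ["a", "b"], 1, 0.7)
# An infinite bound disables early termination (exhaustive baseline).
full, scored_full = daat_top_k(reordered, new_static, ["a", "b"], 1, float("inf"))
```

In this toy run the early-terminating pass returns the same top result as the exhaustive pass while scoring fewer documents. How much work is saved in practice depends on how tight the dynamic-score bound is and on how the static scores are distributed, which is what the abstract's comparison of orderings examines empirically.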