Implementations of partial document ranking using inverted files
Information Processing and Management: an International Journal
Inverted File Partitioning Schemes in Multiple Disk Systems
IEEE Transactions on Parallel and Distributed Systems
Query evaluation: strategies and optimizations
Information Processing and Management: an International Journal
Filtered document retrieval with frequency-sorted indexes
Journal of the American Society for Information Science
Combining fuzzy information from multiple systems (extended abstract)
PODS '96 Proceedings of the fifteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Optimization of inverted vector searches
SIGIR '85 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval
Compressed inverted files with reduced decoding overheads
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Automatic resource compilation by analyzing hyperlink structure and associated text
WWW7 Proceedings of the seventh international conference on World Wide Web 7
The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Authoritative sources in a hyperlinked environment
Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Managing gigabytes (2nd ed.): compressing and indexing documents and images
Managing gigabytes (2nd ed.): compressing and indexing documents and images
ACM Transactions on Information Systems (TOIS)
The stochastic approach for link-structure analysis (SALSA) and the TKC effect
Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications netowrking
Breadth-first crawling yields high-quality pages
Proceedings of the 10th international conference on World Wide Web
Building a distributed full-text index for the Web
Proceedings of the 10th international conference on World Wide Web
Finding authorities and hubs from link structures on the World Wide Web
Proceedings of the 10th international conference on World Wide Web
Optimal aggregation algorithms for middleware
PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
ACM Transactions on Internet Technology (TOIT)
Vector-space ranking with effective early termination
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Static index pruning for information retrieval systems
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the 11th international conference on World Wide Web
Modern Information Retrieval
Combining fuzzy information: an overview
ACM SIGMOD Record
Lessons from Giant-Scale Services
IEEE Internet Computing
Performance of Inverted Indices in Distributed Text Document Retrieval Systems
PDIS '93 Proceedings of the 2nd International Conference on Parallel and Distributed Information Systems
Optimizing Multi-Feature Queries for Image Databases
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Using Fagin's Algorithm for Merging Ranked Results in Multimedia Middleware
COOPIS '99 Proceedings of the Fourth IECIS International Conference on Cooperative Information Systems
On the Resemblance and Containment of Documents
SEQUENCES '97 Proceedings of the Compression and Complexity of Sequences 1997
Query Processing Issues in Image(Multimedia) Databases
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Towards Efficient Multi-Feature Queries in Heterogeneous Environments
ITCC '01 Proceedings of the International Conference on Information Technology: Coding and Computing
Evaluating Top-k Queries over Web-Accessible Databases
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Design and Implementation of a High-Performance Distributed Web Crawler
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Scalable distributed architectures for information retrieval
Scalable distributed architectures for information retrieval
Optimizing result prefetching in web search engines with segmented indices
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Three-level caching for efficient query processing in large Web search engines
WWW '05 Proceedings of the 14th international conference on World Wide Web
Efficient and self-tuning incremental query expansion for top-k query processing
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
An efficient and versatile query engine for TopX search
VLDB '05 Proceedings of the 31st international conference on Very large data bases
KLEE: a framework for distributed top-k query algorithms
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Static score bucketing in inverted indexes
Proceedings of the 14th ACM international conference on Information and knowledge management
Information Processing and Management: an International Journal
Efficient query processing in geographic web search engines
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
IO-Top-k: index-access optimized top-k query processing
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Architecture of a grid-enabled Web search engine
Information Processing and Management: an International Journal
Hybrid global-local indexing for effcient peer-to-peer information retrieval
NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
Pruning policies for two-tiered inverted index with correctness guarantee
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Efficiency-quality tradeoffs for vector score aggregation
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Top-k query evaluation with probabilistic guarantees
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
High performance index build algorithms for intranet search engines
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Effective top-k computation in retrieving structured documents with term-proximity support
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Best position algorithms for top-k queries
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Incremental cluster-based retrieval using compressed cluster-skipping inverted files
ACM Transactions on Information Systems (TOIS)
Relaxation in text search using taxonomies
Proceedings of the VLDB Endowment
Dynamic faceted search for discovery-driven analysis
Proceedings of the 17th ACM conference on Information and knowledge management
MedSearch: a specialized search engine for medical information retrieval
Proceedings of the 17th ACM conference on Information and knowledge management
Can phrase indexing help to process non-phrase queries?
Proceedings of the 17th ACM conference on Information and knowledge management
Top-k aggregation using intersections of ranked inputs
Proceedings of the Second ACM International Conference on Web Search and Data Mining
Inverted index compression and query processing with optimized document ordering
Proceedings of the 18th international conference on World wide web
A search-based method for forecasting ad impression in contextual advertising
Proceedings of the 18th international conference on World wide web
Nullification test collections for web spam and SEO
Proceedings of the 5th International Workshop on Adversarial Information Retrieval on the Web
Effective top-k computation with term-proximity support
Information Processing and Management: an International Journal
Distributed top-k aggregation queries at large
Distributed and Parallel Databases
Leveraging a scalable row store to build a distributed text index
Proceedings of the first international workshop on Cloud data management
Probabilistic static pruning of inverted files
ACM Transactions on Information Systems (TOIS)
Revisiting globally sorted indexes for efficient document retrieval
Proceedings of the third ACM international conference on Web search and data mining
Beyond pages: supporting efficient, scalable entity search with dual-inversion index
Proceedings of the 13th International Conference on Extending Database Technology
Efficient term proximity search with term-pair indexes
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
A web search engine model based on index-query bit-level compression
Proceedings of the 1st International Conference on Intelligent Semantic Web-Services and Applications
Batch query processing for web search engines
Proceedings of the fourth ACM international conference on Web search and data mining
Faster top-k document retrieval using block-max indexes
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Efficiently encoding term co-occurrences in inverted indexes
Proceedings of the 20th ACM international conference on Information and knowledge management
Indexing shared content in information retrieval systems
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Optimized top-k processing with global page scores on block-max indexes
Proceedings of the fifth ACM international conference on Web search and data mining
Index ordering by query-independent measures
Information Processing and Management: an International Journal
When big data leads to lost data
Proceedings of the 5th Ph.D. workshop on Information and knowledge
Processing continuous text queries featuring non-homogeneous scoring functions
Proceedings of the 21st ACM international conference on Information and knowledge management
Reordering an index to speed query processing without loss of effectiveness
Proceedings of the Seventeenth Australasian Document Computing Symposium
Optimizing top-k document retrieval strategies for block-max indexes
Proceedings of the sixth ACM international conference on Web search and data mining
Development of a Novel Compressed Index-Query Web Search Engine Model
International Journal of Information Technology and Web Engineering
Efficient parallel block-max WAND algorithm
Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
Hi-index | 0.00 |
Large web search engines have to answer thousands of queries per second with interactive response times. A major factor in the cost of executing a query is given by the lengths of the inverted lists for the query terms, which increase with the size of the document collection and are often in the range of many megabytes. To address this issue, IR and database researchers have proposed pruning techniques that compute or approximate term-based ranking functions without scanning over the full inverted lists. Over the last few years, search engines have incorporated new types of ranking techniques that exploit aspects such as the hyperlink structure of the web or the popularity of a page to obtain improved results. We focus on the question of how such techniques can be efficiently integrated into query processing. In particular, we study pruning techniques for query execution in large engines in the case where we have a global ranking of pages, as provided by Pagerank or any other method, in addition to the standard term-based approach. We describe pruning schemes for this case and evaluate their efficiency on an experimental cluster-based search engine with million web pages. Our results show that there is significant potential benefit in such techniques.