Due to the rapid growth in the size of the web, web search engines face enormous performance challenges. The larger engines in particular must process tens of thousands of queries per second over tens of billions of documents, making query throughput a critical issue. To satisfy this heavy workload, search engines use a variety of performance optimizations, including index compression, caching, and early termination. We focus on two of these techniques, inverted index compression and index caching, which play a crucial role in web search engines as well as in other high-performance information retrieval systems. We compare and evaluate several inverted list compression algorithms, including new variants of existing algorithms that have not been studied before. We then evaluate different inverted list caching policies on large query traces, and finally study the possible performance benefits of combining compression and caching. The overall goal of this paper is to provide an updated discussion and evaluation of these two techniques, and to show how to select the best set of approaches and settings depending on parameters such as disk speed and main memory cache size.
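To make the combination of compression and caching concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method evaluated in the paper: the abstract does not name specific algorithms or policies, so the sketch uses variable-byte coding of docID gaps (one common inverted list compression scheme) and a byte-bounded LRU policy as stand-ins. The names `vbyte_encode`, `vbyte_decode`, `CompressedListCache`, and the `fetch_from_disk` callback are hypothetical.

```python
from collections import OrderedDict


def vbyte_encode(doc_ids):
    """Encode a sorted list of docIDs as variable-byte coded gaps."""
    out = bytearray()
    prev = 0
    for doc_id in doc_ids:
        gap = doc_id - prev
        prev = doc_id
        # Emit 7 bits per byte, low-order bits first.
        # The high bit is set only on the final byte of each gap.
        while gap >= 128:
            out.append(gap & 0x7F)
            gap >>= 7
        out.append(gap | 0x80)
    return bytes(out)


def vbyte_decode(data):
    """Decode variable-byte coded gaps back into the original docIDs."""
    doc_ids = []
    value = 0
    shift = 0
    prev = 0
    for byte in data:
        if byte & 0x80:
            # Final byte of a gap: finish the value and add it to the running sum.
            value |= (byte & 0x7F) << shift
            prev += value
            doc_ids.append(prev)
            value = 0
            shift = 0
        else:
            value |= byte << shift
            shift += 7
    return doc_ids


class CompressedListCache:
    """LRU cache of compressed inverted lists, bounded by total bytes.

    Assumption: LRU is used only as a simple stand-in policy.
    """

    def __init__(self, capacity_bytes, fetch_from_disk):
        self.capacity_bytes = capacity_bytes
        self.fetch_from_disk = fetch_from_disk  # hypothetical: term -> bytes
        self.lists = OrderedDict()
        self.used_bytes = 0
        self.hits = 0
        self.misses = 0

    def get(self, term):
        if term in self.lists:
            self.hits += 1
            self.lists.move_to_end(term)  # mark as most recently used
            return self.lists[term]
        self.misses += 1
        data = self.fetch_from_disk(term)
        # Lists larger than the whole cache are served but never cached.
        if len(data) <= self.capacity_bytes:
            self.lists[term] = data
            self.used_bytes += len(data)
            # Evict least recently used lists until the cache fits again.
            while self.used_bytes > self.capacity_bytes:
                _, evicted = self.lists.popitem(last=False)
                self.used_bytes -= len(evicted)
        return data
```

Because the cache stores lists in compressed form, a fixed memory budget holds more lists, which tends to raise the hit rate at the price of decompression work on every access. That is the kind of trade-off, dependent on disk speed and cache size, that the abstract describes.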