Web search engines use highly optimized compression schemes to decrease inverted index size and improve query throughput, and many index compression techniques have been studied in the literature. One approach taken by several recent studies first renumbers the document IDs in the collection so that similar documents are grouped together, and then applies standard compression techniques. It is known that this can significantly improve index compression compared to a random document ordering. We study index compression and query processing techniques for such reordered indexes. Previous work has focused on determining the best possible ordering of documents. In contrast, we assume that such an ordering is already given, and focus on how to optimize compression methods and query processing for this case. We perform an extensive study of compression techniques for document IDs and present new optimizations of existing techniques that achieve significant improvements in both compression and decompression performance. We also propose and evaluate techniques for compressing frequency values for this case. Finally, we study the effect of this approach on query processing performance. Our experiments show very significant improvements in index size and query processing speed on the TREC GOV2 collection of 25.2 million web pages.
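To illustrate why grouping similar documents under nearby IDs helps, here is a minimal Python sketch. It is not the method evaluated in this work. It assumes a toy posting list of 1,000 documents in a collection of 1,000,000, and uses plain variable-byte coding of ID gaps as a stand-in for the "standard compression techniques" mentioned above. The function names (`d_gaps`, `varbyte_encode`, `varbyte_decode`) and the synthetic data are illustrative assumptions.

```python
import random
from itertools import accumulate


def d_gaps(doc_ids):
    """Turn a sorted list of document IDs into gaps between consecutive IDs."""
    gaps = []
    prev = 0
    for d in doc_ids:
        gaps.append(d - prev)
        prev = d
    return gaps


def varbyte_encode(numbers):
    """Variable-byte code: 7 payload bits per byte, high bit set on the last byte."""
    out = bytearray()
    for n in numbers:
        chunk = []
        while True:
            chunk.append(n & 0x7F)  # collect 7-bit groups, least significant first
            n >>= 7
            if n == 0:
                break
        chunk.reverse()             # emit most significant group first
        chunk[-1] |= 0x80           # mark the final byte of this number
        out.extend(chunk)
    return bytes(out)


def varbyte_decode(data):
    """Inverse of varbyte_encode."""
    numbers = []
    n = 0
    for b in data:
        if b & 0x80:
            numbers.append((n << 7) | (b & 0x7F))
            n = 0
        else:
            n = (n << 7) | b
    return numbers


# Hypothetical posting list under a random document ordering:
# the 1,000 matching documents are scattered over the whole ID space.
rng = random.Random(0)
random_order = sorted(rng.sample(range(1_000_000), 1000))

# The same number of postings after an (assumed) reordering that places
# the matching documents next to each other.
clustered_order = list(range(5000, 6000))

random_bytes = varbyte_encode(d_gaps(random_order))
clustered_bytes = varbyte_encode(d_gaps(clustered_order))

print("random ordering:   ", len(random_bytes), "bytes")
print("clustered ordering:", len(clustered_bytes), "bytes")
```

With a clustered ordering most gaps are small and fit in a single byte, while scattered IDs produce larger gaps that need more bytes each. Real reordered indexes are far less regular than this toy case, which is why the choice and tuning of the compression method still matters.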