Data compression is widely used in information retrieval applications such as web search engines and digital libraries to speed up data retrieval. In these applications, universal codes such as Elias codes (EC), Fibonacci code (FC), Rice code (RC), Extended Golomb code (EGC), and Fast Extended Golomb code (FEGC) are generally preferred over statistical codes such as Huffman and arithmetic codes, because universal codes are easier to construct and decode. In this paper, the authors propose two methods for constructing universal codes, based on the ideas used in Rice code and Fast Extended Golomb code. The first, Re-ordered FEGC (RFEGC), is suitable for representing small, middle, and large range integers, whereas Rice code works well for small and middle range integers. RFEGC is competitive with FC, EGC, and FEGC in representing small, middle, and large range integers, and it could decode faster than they do. The second coder, Block-based RFEGC, uses a local divisor rather than a global divisor to improve both the compression and the decompression performance of RFEGC. To evaluate their coders, the authors applied them to the integer values of inverted files constructed from the TREC, Wikipedia, and FIRE collections. Experimental results show that the coders achieve better compression and decompression performance on files that contain a significant proportion of middle and large range integers.
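The abstract does not spell out how RFEGC or its block-based variant is constructed, so the following is only a minimal sketch of the two underlying ideas it names: standard Rice coding with a divisor of 2^k, and the difference between one global parameter and a parameter chosen locally per block. The function names, the bit-string representation, the brute-force parameter search, and the sample values are all assumptions made for illustration; this is not the authors' coder.

```python
# Illustrative sketch only: plain Rice coding, not the RFEGC coder from the paper.

def rice_encode(n: int, k: int) -> str:
    """Rice codeword of n >= 0 with divisor 2**k, as a string of '0'/'1'."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    quotient = n >> k                      # n // 2**k, written in unary
    remainder = n & ((1 << k) - 1)         # n %  2**k, written in k binary digits
    unary = "1" * quotient + "0"
    binary = format(remainder, "b").zfill(k) if k else ""
    return unary + binary


def rice_decode(bits: str, k: int, pos: int = 0) -> tuple[int, int]:
    """Decode one codeword starting at pos; return (value, next position)."""
    quotient = 0
    while bits[pos] == "1":                # count the unary part
        quotient += 1
        pos += 1
    pos += 1                               # skip the terminating '0'
    remainder = int(bits[pos:pos + k], 2) if k else 0
    return (quotient << k) | remainder, pos + k


def rice_cost(values: list[int], k: int) -> int:
    """Total codeword length in bits for values under parameter k."""
    return sum((v >> k) + 1 + k for v in values)


def best_k(values: list[int], max_k: int = 32) -> int:
    """Brute-force search for the k that minimises the encoded size."""
    return min(range(max_k + 1), key=lambda k: rice_cost(values, k))


def global_cost(values: list[int]) -> int:
    """Encoded size when a single parameter is shared by the whole list."""
    return rice_cost(values, best_k(values))


def blockwise_cost(values: list[int], block_size: int) -> int:
    """Encoded size when each block picks its own parameter.

    Assumption: the bits needed to store each block's k are not counted.
    """
    total = 0
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        total += rice_cost(block, best_k(block))
    return total


# Hypothetical gap values: a run of small integers followed by large ones.
gaps = [1, 2, 3, 1, 2, 1, 3, 2, 900, 1100, 1000, 950]
print(global_cost(gaps), blockwise_cost(gaps, block_size=4))
```

With k = 2, `rice_encode(9, 2)` yields `11001`: the quotient 2 in unary (`110`) followed by the remainder 1 in two bits (`01`). Ignoring the cost of storing the per-block parameters, the block-wise payload can never exceed the global one, since every block is free to reuse the global k. A real coder would have to weigh that saving against the side information it adds.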