Web Information Retrieval (IR) systems such as Google and Bing have become an essential part of using the Internet. One of the main components of an IR system is the indexing module, which builds an index file to speed up searching. In this research project we evaluated the improvement in IR efficiency that results from encoding the terms of the index with Huffman coding. The results showed a 40% reduction in term size, which reduces the overall index size and, consequently, the index transfer time. The results also showed a 36% reduction in the number of comparisons needed to process each query using binary search. The approach therefore reduces the CPU time spent on each query, which increases retrieval efficiency.
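To make the idea of Huffman-encoding index terms concrete, the following is a minimal sketch, not the authors' implementation. It assumes a character-level Huffman code built from the letter frequencies of the term list and an 8-bit-per-character baseline; the abstract does not state either choice. The function names and the sample terms are hypothetical.

```python
import heapq
from collections import Counter


def build_huffman_code(terms):
    """Map each character in the term list to a Huffman bit string.

    Assumption: the code is built per character over all terms.
    """
    freq = Counter("".join(terms))
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {char: code so far}).
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Prefix the two lightest subtrees with 0 and 1 and merge them.
        merged = {ch: "0" + c for ch, c in left.items()}
        merged.update({ch: "1" + c for ch, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode_term(term, code):
    """Concatenate the bit strings of the term's characters."""
    return "".join(code[ch] for ch in term)


def decode_term(bits, code):
    """Decode a bit string by matching prefix-free code words greedily."""
    inverse = {v: k for k, v in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)


# Hypothetical sample vocabulary, for illustration only.
terms = ["compression", "index", "information", "inverted", "retrieval", "search"]
code = build_huffman_code(terms)
raw_bits = sum(8 * len(t) for t in terms)  # assumed 8-bit baseline
huff_bits = sum(len(encode_term(t, code)) for t in terms)
print(f"raw: {raw_bits} bits, Huffman: {huff_bits} bits")
```

The saving on a real index depends on the vocabulary and its character distribution, so this sketch is not expected to reproduce the 40% figure reported above. It also leaves out how the encoded terms would be ordered and compared during binary search.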