Journal of Network and Computer Applications
One of the key challenges of managing very large volumes of data in scalable information retrieval systems is providing fast access through keyword searches. The major data structure in an information retrieval system is the inverted file, which records the positions of each term in the documents. As the information set grows, the number of terms and documents increases significantly, and so does the size of the inverted files. Approaches that reduce the size of the inverted file without sacrificing query efficiency are therefore important to the success of scalable information systems. In this paper, we propose a compression approach based on binary decision diagram (BDD) encoding, so that ordering correlations among large numbers of documents can be exploited to minimize the posting representation. Another advantage of using BDDs is that BDD expressions can efficiently perform Boolean queries, which are very common in retrieval systems. Experimental results show that the BDD scheme significantly improves the compression ratios of the inverted files.
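As a rough illustration of the idea, the sketch below encodes a posting list (a set of document IDs) as a reduced ordered BDD over the bits of the ID and answers a Boolean AND query by conjoining two diagrams. It is a minimal sketch and not the paper's actual encoding: the class name `PostingBDD`, the fixed-width binary ID encoding, and the most-significant-bit-first variable order are all assumptions made for this example.

```python
class PostingBDD:
    """Minimal reduced ordered BDD over the bits of a document ID (illustrative only)."""

    def __init__(self, num_bits):
        self.num_bits = num_bits
        # Node 0 is the FALSE terminal, node 1 is the TRUE terminal.
        self.nodes = [None, None]
        # Unique table: (level, low, high) -> node id, so equal subgraphs are shared.
        self.unique = {}

    def mk(self, level, low, high):
        # Reduction rule: a test whose two branches agree is redundant.
        if low == high:
            return low
        key = (level, low, high)
        if key not in self.unique:
            self.unique[key] = len(self.nodes)
            self.nodes.append(key)
        return self.unique[key]

    def from_ids(self, ids, level=0):
        # Build the characteristic function of a set of document IDs.
        # Level 0 tests the most significant bit (assumed variable order).
        if not ids:
            return 0
        if level == self.num_bits:
            return 1
        shift = self.num_bits - 1 - level
        low = [d for d in ids if not (d >> shift) & 1]
        high = [d for d in ids if (d >> shift) & 1]
        return self.mk(level, self.from_ids(low, level + 1),
                       self.from_ids(high, level + 1))

    def conj(self, u, v, memo=None):
        # Boolean AND of two posting lists, computed directly on the diagrams.
        if memo is None:
            memo = {}
        if u == 0 or v == 0:
            return 0
        if u == 1:
            return v
        if v == 1:
            return u
        if (u, v) in memo:
            return memo[(u, v)]
        lu, u0, u1 = self.nodes[u]
        lv, v0, v1 = self.nodes[v]
        level = min(lu, lv)
        if lu > level:      # u does not test this bit: both branches stay at u
            u0 = u1 = u
        if lv > level:      # v does not test this bit: both branches stay at v
            v0 = v1 = v
        result = self.mk(level, self.conj(u0, v0, memo), self.conj(u1, v1, memo))
        memo[(u, v)] = result
        return result

    def contains(self, u, doc_id):
        # Follow one path from the root; the terminal reached gives the answer.
        while u > 1:
            level, low, high = self.nodes[u]
            bit = (doc_id >> (self.num_bits - 1 - level)) & 1
            u = high if bit else low
        return u == 1


# Hypothetical postings for two terms over 4-bit document IDs.
bdd = PostingBDD(num_bits=4)
odd_docs = bdd.from_ids([1, 3, 5, 7, 9, 11, 13, 15])
other_docs = bdd.from_ids([3, 7, 8])
both = bdd.conj(odd_docs, other_docs)
matches = [d for d in range(16) if bdd.contains(both, d)]
```

In this toy case the eight odd document IDs collapse to a single node that tests only the lowest bit, which is the kind of regularity among document IDs that a BDD representation can exploit. How much real posting lists compress depends on the ID assignment and variable order, which this sketch does not attempt to optimise.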