The emergence of the CD-ROM as a storage medium for full-text databases raises the question of how large a database this medium can hold. As an example, this paper examines the problem of storing the Trésor de la Langue Française on a CD-ROM. The text of this database alone occupies 700 megabytes, more than a CD-ROM can hold. In addition, the dictionary and concordance needed to access the data must be stored. A further constraint is that some of the material is copyrighted, so it should be difficult to decode except through software provided by the system. Pertinent approaches to compressing the various files are reviewed, and the compression of the text is related to the problem of data encryption: specifically, it is shown that, under simple models of text generation, Huffman encoding produces a bit string indistinguishable from a representation of coin flips.
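The closing claim about Huffman output resembling coin flips can be illustrated with a small experiment. The sketch below is not the paper's argument or model: it assumes a hypothetical four-symbol alphabet whose probabilities are exact powers of 1/2, draws symbols independently, and measures two simple statistics of the encoded bit string. The names `huffman_code`, `probs`, `ones_fraction`, and `repeat_fraction` are illustrative only.

```python
import heapq
import random
from itertools import count


def huffman_code(probs):
    """Build a Huffman code; returns {symbol: bit string}."""
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing 0 and 1.
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


# Assumption: a toy source with dyadic probabilities (powers of 1/2),
# emitting symbols independently. Real text is not like this.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)

rng = random.Random(1987)  # fixed seed so the run is repeatable
text = rng.choices(list(probs), weights=list(probs.values()), k=50_000)
bits = "".join(code[s] for s in text)

# Fraction of 1 bits, and fraction of positions whose bit equals the next.
ones_fraction = bits.count("1") / len(bits)
repeat_fraction = sum(x == y for x, y in zip(bits, bits[1:])) / (len(bits) - 1)

print(code)
print(round(ones_fraction, 3), round(repeat_fraction, 3))
```

For this toy source each codeword length equals the negative base-2 logarithm of its symbol's probability, so both fractions should land near one half, as they would for fair coin flips. These two statistics are only a quick sanity check, not a proof of indistinguishability, and for probabilities that are not powers of 1/2 the match is no longer exact.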