Several complementary preprocessing algorithms for text files are presented, all of which are applied before the compression scheme. The algorithms need no external dictionary and are language independent. The compression gain and its cost in speed are compared for the BWT, PPM, and LZ compression schemes. The average overall compression gain is in the range of 3 to 5 percent for the text files of the Calgary Corpus and between 2 and 9 percent for the text files of the large Canterbury Corpus.