A new PPM variant for Chinese text compression
Natural Language Engineering
This paper proposes a crossbar-like tree structure as the initial model for Dynamic Markov Compression (DMC) when compressing Chinese text files. DMC had previously been found to be more effective than common compression utilities such as compress and pack, giving a compression gain of between 13.1% and 32.0%. The proposed initial structure improves on DMC's compression results and outperforms the commonly adopted initial structures (single-state, linear, tree, and braid) by a gain ranging from 1.5% to 9.6%.
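To illustrate what an "initial structure" means for DMC, the following is a minimal sketch of a bit-level DMC predictor that starts from the single-state structure named above as one of the common baselines. It is not the paper's crossbar-like structure, which the abstract does not specify. The class name DMCModel, the function code_length_bits, the cloning thresholds, and the initial counts are all assumptions chosen for illustration. The sketch reports the ideal code length in bits rather than driving a real arithmetic coder.

```python
import math


class DMCModel:
    """Bit-level DMC predictor with a single-state initial structure.

    Thresholds and initial counts are illustrative assumptions,
    not values taken from the paper.
    """

    def __init__(self, min_edge=2.0, min_rest=2.0, init_count=0.2):
        # Single-state initial structure: both transitions loop back to state 0.
        self.next = [[0, 0]]
        self.count = [[init_count, init_count]]
        self.state = 0
        self.min_edge = min_edge
        self.min_rest = min_rest

    def prob(self, bit):
        # Probability of `bit` estimated from the current state's transition counts.
        c0, c1 = self.count[self.state]
        return self.count[self.state][bit] / (c0 + c1)

    def update(self, bit):
        s = self.state
        t = self.next[s][bit]
        edge = self.count[s][bit]
        total_t = self.count[t][0] + self.count[t][1]
        # Clone the target state when this edge is heavily used and the
        # target also receives enough traffic from other edges.
        if edge > self.min_edge and total_t - edge > self.min_rest:
            ratio = edge / total_t
            clone = len(self.next)
            self.next.append(list(self.next[t]))
            moved = [self.count[t][0] * ratio, self.count[t][1] * ratio]
            self.count.append(moved)
            # Split the target's counts between the original and the clone.
            self.count[t][0] -= moved[0]
            self.count[t][1] -= moved[1]
            self.next[s][bit] = clone
        self.count[s][bit] += 1.0
        self.state = self.next[s][bit]


def code_length_bits(data, model=None):
    """Ideal code length of `data` (bytes) under the adaptive model, in bits."""
    if model is None:
        model = DMCModel()
    bits = 0.0
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            bit = (byte >> shift) & 1
            bits -= math.log2(model.prob(bit))
            model.update(bit)
    return bits
```

Swapping in a different initial structure would amount to pre-populating next and count with more states before any input is seen. Comparing code_length_bits across such starting models on the same input is one way to estimate a relative compression gain.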