Malware detection using adaptive data compression
Proceedings of the 1st ACM workshop on Workshop on AISec
The Prediction by Partial Matching (PPM) algorithm uses cumulative frequency counts of input symbols in different contexts to estimate their probability distribution. The excellent compression ratios yielded by PPM have not led to broader use of the scheme, mainly because of its high demand for computational resources. In this paper, we present an algorithm that improves the memory usage of the PPM model. The algorithm heuristically identifies and removes portions of the PPM model that do not contribute to better modeling of the input data. As a result, under a fixed memory limit, our algorithm improves the average compression ratio by up to 7% at the expense of increased computation. When the compression ratio is held at the same level, our algorithm reduces memory usage by up to 70%.
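To make the idea of per-context frequency counts concrete, the following is a minimal Python sketch, not the paper's implementation. It builds symbol counts for contexts of length 0 up to a small maximum order and derives a plain relative-frequency estimate from them. The escape mechanism of a full PPM coder is omitted. The names `build_context_counts`, `estimate`, and `prune`, as well as the `min_total` threshold, are assumptions made for illustration; in particular, the pruning rule shown (drop contexts observed fewer than `min_total` times) is a hypothetical stand-in, since the abstract does not specify the actual heuristic.

```python
from collections import Counter, defaultdict


def build_context_counts(data: bytes, max_order: int = 2):
    """Count how often each symbol follows each context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i, symbol in enumerate(data):
        for order in range(max_order + 1):
            if i - order < 0:
                break  # not enough preceding symbols for this order
            context = data[i - order:i]  # the `order` bytes before position i
            counts[context][symbol] += 1
    return counts


def estimate(counts, context: bytes, symbol: int) -> float:
    """Relative frequency of `symbol` after `context` (no escape handling)."""
    seen = counts.get(context)
    if not seen:
        return 0.0
    return seen[symbol] / sum(seen.values())


def prune(counts, min_total: int = 2):
    """Hypothetical heuristic: drop non-empty contexts seen fewer than min_total times."""
    return {
        ctx: c
        for ctx, c in counts.items()
        if len(ctx) == 0 or sum(c.values()) >= min_total
    }


counts = build_context_counts(b"abracadabra", max_order=2)
smaller = prune(counts, min_total=2)
```

In this sketch, rarely seen contexts are treated as the portions of the model least likely to help prediction, so removing them trades a small loss in modeling detail for a smaller table. A real PPM model would typically store contexts in a trie and would need a more careful criterion than a raw count threshold.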