Compression and Machine Learning: A New Perspective on Feature Space Vectors

Authors:
D. Sculley;Carla E. Brodley
Affiliations:
Tufts University;Tufts University
Venue:
DCC '06 Proceedings of the Data Compression Conference
Year:
2006


Abstract

The use of compression algorithms in machine learning tasks such as clustering and classification has appeared in a variety of fields, sometimes with the promise of reducing problems of explicit feature selection. The theoretical justification for such methods has been founded on an upper bound on Kolmogorov complexity and an idealized information space. An alternative view shows that compression algorithms implicitly map strings into feature space vectors, and that compression-based similarity measures compute similarity within these feature spaces. Thus, compression-based methods are not a "parameter free" magic bullet for feature selection and data representation, but are instead concrete similarity measures within defined feature spaces, and are therefore akin to the explicit feature vector models used in standard machine learning algorithms. To underscore this point, we find theoretical and empirical connections between traditional machine learning vector models and compression, encouraging cross-fertilization in future work.
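
To make "compression-based similarity measure" concrete, the sketch below computes the Normalized Compression Distance (NCD), one widely used measure of this kind. The abstract does not name a specific measure, so the choice of NCD, the use of zlib as the compressor, and the function names `compressed_size` and `ncd` are all assumptions made for illustration, not the authors' method.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input.

    Serves as a computable stand-in for (an upper bound on) Kolmogorov
    complexity. zlib is an assumption; any real compressor could be used.
    """
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size. Smaller values mean the
    compressor found more shared structure between x and y.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    a = b"the quick brown fox jumps over the lazy dog " * 20
    b = b"compression maps strings into feature vectors " * 20
    # A string compared with itself should score lower (more similar)
    # than two unrelated strings.
    print("ncd(a, a) =", ncd(a, a))
    print("ncd(a, b) =", ncd(a, b))
```

The score depends entirely on which repeated substrings the chosen compressor can detect, which is in line with the abstract's point that such measures operate within a concrete, compressor-defined feature space and are not parameter free.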