On data mining, compression, and Kolmogorov complexity

  • Authors:
  • Christos Faloutsos; Vasileios Megalooikonomou

  • Affiliations:
  • School of Computer Science, Carnegie Mellon University, Pittsburgh, USA 15213-3891; Department of Computer and Information Sciences, Temple University, Philadelphia, USA 19122

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 2007

Abstract

Will we ever have a theory of data mining analogous to the relational algebra in databases? Why do we have so many clearly different clustering algorithms? Could data mining be automated? We show that the answer to all these questions is negative, because data mining is closely related to compression and Kolmogorov complexity, and the latter is undecidable. Therefore, data mining will always be an art, where our goal will be to find better models (patterns) that fit our datasets as well as possible.
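
The link between patterns and compression can be illustrated with a small sketch. It is not taken from the paper: it assumes Python's standard zlib compressor as a stand-in model, and the helper name compressed_size is hypothetical. The compressed length is only a computable upper bound on Kolmogorov complexity; no program can return the exact value for arbitrary input, which is why a better model can always remain undiscovered.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed form of data.

    Hypothetical helper: this is an upper bound on the Kolmogorov
    complexity of data (up to a constant), never the exact value.
    """
    return len(zlib.compress(data, 9))


# 10,000 bytes with an obvious repeating pattern.
patterned = b"ab" * 5000
# 10,000 bytes with no regularity that zlib can exploit.
random_like = os.urandom(10000)

patterned_size = compressed_size(patterned)
random_size = compressed_size(random_like)

print("patterned:", len(patterned), "->", patterned_size, "bytes")
print("random:   ", len(random_like), "->", random_size, "bytes")
```

The patterned input shrinks to a small fraction of its size because the compressor has found a model that fits it, while the random input barely shrinks at all. A different compressor (a different model) could do better on other data, and there is no general procedure for deciding whether the best one has been found.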