A theoretical framework for data mining: the "informational paradigm"

  • Authors: Renato Coppi

  • Affiliations: Department of Statistics, Probability and Applied Statistics, University of Rome "La Sapienza", p. le Aldo Moro 5, I-00185 Roma, Italy

  • Venue: Computational Statistics & Data Analysis - Nonlinear methods and data mining
  • Year: 2002

Abstract

Data Mining (DM) is examined from a statistical perspective, as a methodological area whose objective is to extract useful information from very large databases. It is stressed that DM, as it presently stands, lacks sound theoretical foundations. The main statistical paradigms are briefly reviewed and evaluated with reference to the practice of DM, and it is argued that they are insufficient to provide a consistent background for DM activities. The "informational" paradigm is then presented in general terms, and some issues concerning design and analysis in DM are discussed within it. A few examples are given, relating to the problems of finding association rules in a database and of setting up appropriate classification procedures.