Toward data mining engineering: A software engineering approach

  • Authors:
  • Oscar Marbán;Javier Segovia;Ernestina Menasalvas;Covadonga Fernández-Baizán

  • Affiliations:
  • Facultad de Informática, Universidad Politécnica de Madrid (U.P.M.), Spain;Facultad de Informática, Universidad Politécnica de Madrid (U.P.M.), Spain;Facultad de Informática, Universidad Politécnica de Madrid (U.P.M.), Spain;Facultad de Informática, Universidad Politécnica de Madrid (U.P.M.), Spain

  • Venue:
  • Information Systems
  • Year:
  • 2009

Abstract

The number, variety and complexity of projects involving data mining or knowledge discovery in databases have recently increased at such a pace that aspects of their development process need to be standardized so that results can be integrated, reused and interchanged in the future. Data mining projects are quickly becoming engineering projects, and current standard processes, such as CRISP-DM, need to be revisited to incorporate this engineering viewpoint. This is the central motivation of this paper, which argues that the experience gained with the software development process over almost 40 years can be reused and integrated to improve data mining processes. Consequently, this paper proposes reusing the ideas and concepts underlying the IEEE Std 1074 and ISO 12207 software engineering process models to redefine and extend the CRISP-DM process and turn it into a data mining engineering standard.