Development of a robust data mining method using CBFS and RSM

  • Authors:
  • Sangmun Shin;Yi Guo;Yongsun Choi;Myeonggil Choi;Charles Kim

  • Affiliations:
  • Department of Systems Management & Engineering, Inje University, Gimhae, South Korea;Department of Systems Management & Engineering, Inje University, Gimhae, South Korea;Department of Systems Management & Engineering, Inje University, Gimhae, South Korea;Department of Systems Management & Engineering, Inje University, Gimhae, South Korea;School of Computer Engineering, Inje University, Gimhae, South Korea

  • Venue:
  • PSI'06 Proceedings of the 6th International Andrei Ershov Memorial Conference on Perspectives of Systems Informatics
  • Year:
  • 2006

Abstract

Data mining (DM) has emerged as a key feature of many information system applications. While data analysis (DA) represents a significant advance in the analytical tools currently available, its capability is limited. To address one of these limitations, the difficulty of identifying causal relationships, we propose an integrated approach called robust data mining (RDM), which can reduce the dimensionality of a large data set and may provide detailed statistical relationships among the factors as well as robust factor settings. The objective of this paper is two-fold. First, we show how DM techniques can be applied effectively to wastewater treatment process design by using a correlation-based feature selection (CBFS) method. This method may be considerably more effective than alternative methods when the process design procedure involves a large number of input factors. Second, we show how the DM results can be integrated into a robust design (RD) paradigm based on the selected significant factors. Our numerical example shows that the proposed RDM method can efficiently find the significant factors and their optimal settings by reducing dimensionality.
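
The abstract does not spell out how the CBFS step scores candidate factors. As a rough illustration only, the sketch below assumes the widely used correlation-based merit heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean absolute factor-response correlation and r_ff the mean absolute factor-factor correlation, combined with a greedy forward search. The function names, the use of Pearson correlation, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(subset, columns, target):
    """Assumed merit heuristic: rewards correlation with the response,
    penalizes correlation among the selected factors."""
    k = len(subset)
    r_cf = sum(abs(pearson(columns[j], target)) for j in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def select_features(columns, target):
    """Greedy forward search: add the factor that most improves the merit,
    stop as soon as no remaining factor gives a strict improvement."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        best_j, best_score = None, best
        for j in remaining:
            score = cfs_merit(selected + [j], columns, target)
            if score > best_score:
                best_j, best_score = j, score
        if best_j is None:
            break
        selected.append(best_j)
        remaining.remove(best_j)
        best = best_score
    return selected


# Hypothetical toy data: x1 duplicates x0 (redundant), x2 is uncorrelated with y.
x0 = [1, 2, 3, 4, 5, 6, 7, 8]
x1 = list(x0)
x2 = [1, -1, -1, 1, 1, -1, -1, 1]
y = [2 * v for v in x0]
columns = [x0, x1, x2]

print(select_features(columns, y))
```

On this toy data the redundant copy and the uncorrelated factor add nothing to the merit, so the search keeps a single factor. In an RDM-style workflow, the retained factors would then be passed on to the robust design stage; that stage is not sketched here.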