Expert Systems with Applications: An International Journal
Rapid advances in information technology have made abundant data readily available. Data mining techniques have been used for process optimization in many manufacturing settings, including automotive, LCD, semiconductor, and steel production. However, such data sets often contain a large number of missing values arising from several causes (e.g., data discarded because of gross measurement errors, measurement machine breakdown, routine maintenance, sampling inspection, and sensor failure), which frequently complicates the application of data mining. This study proposes a new process optimization procedure, the missing values-Patient Rule Induction Method (m-PRIM), which handles the missing-values problem systematically and yields considerable process improvement even when a significant portion of the data set has missing values. A case study of a semiconductor manufacturing process illustrates the proposed procedure.
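To make the setting concrete, here is a minimal sketch of a single peeling step in the style of the generic Patient Rule Induction Method, written so that it tolerates missing entries. It is not the m-PRIM procedure itself, since the abstract does not say how m-PRIM treats missing values. As a purely illustrative assumption, a row whose value is missing in the candidate variable is never peeled on that variable. The function name `peel_once`, the parameter `alpha`, and the NaN encoding of missing values are all hypothetical choices made for this example.

```python
import numpy as np

def peel_once(X, y, in_box, alpha=0.1):
    """One PRIM-style peel that skips missing entries (illustrative only).

    X: 2-D float array with NaN marking missing values (assumed encoding).
    y: 1-D response to be maximized.
    in_box: boolean mask of rows currently inside the box.
    Returns (new_mask, peel), where peel is (variable, side, cut) or None
    if no candidate peel raises the mean response inside the box.
    """
    best_mean = y[in_box].mean()
    best_mask, best_peel = in_box, None
    for j in range(X.shape[1]):
        col = X[:, j]
        # Only rows inside the box with an observed value can be peeled.
        observed = in_box & ~np.isnan(col)
        if observed.sum() < 2:
            continue
        low_cut = np.quantile(col[observed], alpha)
        high_cut = np.quantile(col[observed], 1 - alpha)
        for side, cut in (("low", low_cut), ("high", high_cut)):
            beyond = col < cut if side == "low" else col > cut
            drop = observed & beyond
            keep = in_box & ~drop
            if not drop.any() or not keep.any():
                continue
            mean_after = y[keep].mean()
            if mean_after > best_mean:
                best_mean, best_mask, best_peel = mean_after, keep, (j, side, cut)
    return best_mask, best_peel
```

Calling `peel_once` repeatedly, feeding each returned mask back in until the box becomes too small or no peel improves the mean, gives the usual top-down peeling loop. How missing rows should really be handled (kept, imputed, or weighted) is exactly the design question the proposed procedure addresses, so the rule used here should be read only as a placeholder.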