Classification of weld flaws with imbalanced class data

  • Authors:
  • T. Warren Liao

  • Affiliations:
  • Industrial Engineering Department, Louisiana State University, Baton Rouge, LA 70808, United States

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2008

Quantified Score

Hi-index 12.06

Visualization

Abstract

This paper presents research results of our investigation of the imbalanced data problem in the classification of different types of weld flaws, a multi-class classification problem. The one-against-all scheme is adopted to carry out multi-class classification and three algorithms including minimum distance, nearest neighbors, and fuzzy nearest neighbors are employed as the classifiers. The effectiveness of 22 data preprocessing methods for dealing with imbalanced data is evaluated in terms of eight evaluation criteria to determine whether any method would emerge to dominate the others. The test results indicate that: (1) nearest neighbor classifiers outperform the minimum distance classifier; (2) some data preprocessing methods do not improve any criterion and they vary from one classifier to another; (3) the combination of using the AHC_KM data preprocessing method with the 1-NN classifier is the best because they together produce the best performance in six of eight evaluation criteria; and (4) the most difficult weld flaw type to recognize is crack.