Preprocessing enhancements to improve data mining algorithms

  • Authors:
  • Paraskevas Orfanidis; David J. Russomanno

  • Affiliations:
  • Department of Electrical and Computer Engineering, Herff College of Engineering, The University of Memphis, Memphis, TN 38152, USA; Department of Electrical and Computer Engineering, Herff College of Engineering, The University of Memphis, Memphis, TN 38152, USA

  • Venue:
  • International Journal of Business Intelligence and Data Mining
  • Year:
  • 2008

Abstract

Preprocessing is often required before using clustering or other data mining algorithms to analyse multivariate data sets. The approaches discussed in this paper are enhanced implementations of a preprocessing step that utilises an algorithm to cluster the points in a data set based upon each attribute independently, yielding additional information about the data points with respect to each of their dimensions. By combining this additional information across all dimensions and querying the results, noise, data boundaries, and likely representatives of data subsets can be more easily identified, thus significantly improving the performance of subsequent clustering or data mining algorithms.
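
The general idea of clustering on each attribute independently and then combining the results can be illustrated with a minimal sketch. The abstract does not specify the per-attribute algorithm, so the sketch assumes a simple gap-based one-dimensional grouping; the function names (`cluster_1d`, `per_attribute_labels`, `flag_noise`) and the parameters `gap` and `min_size` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cluster_1d(values, gap):
    """Assumed 1-D rule: sorted neighbours farther apart than `gap` start a new cluster."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    current = 0
    for prev, nxt in zip(order, order[1:]):
        if values[nxt] - values[prev] > gap:
            current += 1
        labels[nxt] = current
    return labels


def per_attribute_labels(points, gap):
    """Cluster each attribute independently; return one tuple of labels per point."""
    columns = list(zip(*points))
    per_dim = [cluster_1d(col, gap) for col in columns]
    return [tuple(dim[i] for dim in per_dim) for i in range(len(points))]


def flag_noise(points, gap, min_size):
    """Flag a point as likely noise if any of its 1-D clusters has fewer than `min_size` members."""
    signatures = per_attribute_labels(points, gap)
    if not signatures:
        return []
    n_dims = len(signatures[0])
    sizes = [Counter(sig[d] for sig in signatures) for d in range(n_dims)]
    return [
        any(sizes[d][sig[d]] < min_size for d in range(n_dims))
        for sig in signatures
    ]


# Two compact groups plus one isolated point (illustrative data only).
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
    (5.0, 5.1), (5.2, 4.9), (5.1, 5.0),
    (9.0, 1.0),
]
signatures = per_attribute_labels(points, gap=1.0)
noise = flag_noise(points, gap=1.0, min_size=2)
```

Here each point receives one label per dimension, and querying the combined labels (for example, asking which points fall in a sparsely populated cluster along any attribute) singles out the isolated point as a noise candidate. A subsequent clustering algorithm could then be run on the remaining points only.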