Dimension reduction techniques and the classification of bent double galaxies

  • Authors:
  • Imola K. Fodor; Chandrika Kamath

  • Affiliations:
  • Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, P.O. Box 808 L-560, Livermore, CA 94551, USA; Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, P.O. Box 808 L-560, Livermore, CA 94551, USA

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2002

Abstract

As data mining gains acceptance in the analysis of massive data sets, it is becoming clear that there is a need for algorithms that can handle not only the massive size, but also the high dimensionality of the data. Certain pattern recognition algorithms can become computationally intractable when the number of features reaches hundreds or even thousands, while others can break down if there are large correlations among the features. A common solution to these problems is to reduce the dimension, either in conjunction with the pattern recognition algorithm or independent of it. We describe how dimension reduction techniques can be applied in the context of a specific data mining application, namely, the classification of radio-galaxies with a bent double morphology. We discuss certain statistical and exploratory data analysis methods to reduce the number of features, and the subsequent improvements in the performance of decision tree and generalized linear model classifiers. We show that a careful extraction and selection of features is necessary for the successful application of data mining techniques.
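
The abstract does not name the specific reduction methods used, so the following is only a minimal sketch of one generic way to reduce features independently of the classifier: greedily dropping any feature that is strongly correlated with one already kept. The helper names (pearson, drop_correlated), the 0.95 threshold, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.95):
    """Keep a feature only if |r| with every already-kept feature is below threshold.

    `columns` maps a feature name to its list of values.
    Returns the kept feature names in their original order.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Synthetic data (hypothetical): "b" is nearly a rescaled copy of "a",
# while "c" is drawn independently of both.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(200)]
b = [2 * v + rng.gauss(0, 0.01) for v in a]
c = [rng.gauss(0, 1) for _ in range(200)]
features = {"a": a, "b": b, "c": c}

kept = drop_correlated(features)
print(kept)
```

A filter of this kind runs before any classifier is trained, which corresponds to reducing the dimension independently of the pattern recognition algorithm; the reduced feature set could then be passed to a decision tree or a generalized linear model.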