The imbalanced problem in morphological galaxy classification

  • Authors:
  • Jorge De la Calleja;Gladis Huerta;Olac Fuentes;Antonio Benitez;Eduardo López Domínguez;Ma. Auxilio Medina

  • Affiliations:
  • Ingeniería en Informática, Universidad Politécnica de Puebla, Puebla, México;Ingeniería en Informática, Universidad Politécnica de Puebla, Puebla, México;Computer Science Department, University of Texas at El Paso, Texas;Ingeniería en Informática, Universidad Politécnica de Puebla, Puebla, México;Ingeniería en Informática, Universidad Politécnica de Puebla, Puebla, México;Ingeniería en Informática, Universidad Politécnica de Puebla, Puebla, México

  • Venue:
  • CIARP'10 Proceedings of the 15th Iberoamerican congress conference on Progress in pattern recognition, image analysis, computer vision, and applications
  • Year:
  • 2010

Abstract

In this paper we present an experimental study of the performance of six machine learning algorithms applied to morphological galaxy classification. We also address learning from imbalanced data sets, a problem inherent to many real-world applications, including astronomical data analysis. We used two over-sampling techniques, SMOTE and Resampling, and varied the number of generated instances used for classification. Our experimental results show that Random Forest with Resampling obtains the best results for three, five and seven galaxy types, with an F-measure of about 0.99 in all cases.
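
To make the over-sampling idea and the reported metric concrete, the following is a minimal sketch and not the authors' implementation. It assumes that "Resampling" can be approximated by random over-sampling with replacement until all classes are the same size; the abstract does not state the exact procedure or toolkit. SMOTE differs in that it synthesizes new minority instances by interpolating between neighbouring samples, which is not shown here. The function names, class labels, and toy data are hypothetical.

```python
import random
from collections import Counter


def oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until every class
    has as many instances as the largest class (an assumed stand-in
    for the paper's Resampling step)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw the missing instances with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def f_measure(y_true, y_pred, positive):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy imbalanced set: four "spiral" feature vectors, one "elliptical".
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = ["spiral"] * 4 + ["elliptical"]
X_bal, y_bal = oversample(X, y)
class_counts = Counter(y_bal)  # both classes now have the same size
```

The balanced set would then be passed to a classifier such as a random forest, and `f_measure` computed per galaxy type on held-out data that has not been over-sampled, so that duplicated instances do not inflate the score.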