Data set Editing by Ordered Projection

  • Authors:
  • Jesús S. Aguilar; José C. Riquelme; Miguel Toro

  • Affiliations:
  • Departamento de Lenguajes y Sistemas Informáticos, Facultad de Informática, Universidad de Sevilla, Avda. Reina Mercedes s/n. 41012 Sevilla, Spain. E-mail: {aguilar,riquelme,mtoro}@lsi.u ...

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2001

Abstract

This paper presents a new approach to data set editing. The algorithm (EOP: Editing by Ordered Projection) has some interesting characteristics: important reduction of the number of examples from the database; lower computational cost (O(mn log n)) with respect to other typical algorithms due to the absence of distance calculations; conservation of the decision boundaries, especially from the point of view of the application of axis-parallel classifiers. The performance of EOP is analysed in two ways: percentage of reduction and classification. EOP has been compared to IB2, ENN and SHRINK concerning the percentage of reduction and the computational cost. In addition, we have analysed the accuracy of k-NN and C4.5 after applying the reduction techniques. An extensive empirical study using databases with continuous attributes from the UCI repository shows that EOP is a valuable preprocessing method for the later application of any axis-parallel learning algorithm.
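
The abstract does not spell out the procedure, so the following is only an illustrative sketch of how editing by sorted projections could avoid distance calculations. It assumes that n is the number of examples and m the number of attributes, and that an example is kept when, in the ordering along at least one attribute, one of its neighbours belongs to a different class. The function name edit_by_ordered_projection, the keep rule, and the tie handling (ties follow input order) are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def edit_by_ordered_projection(
    X: Sequence[Sequence[float]], y: Sequence[int]
) -> List[int]:
    """Return the indices of the examples kept by the sketch (hypothetical)."""
    n = len(X)
    if n == 0:
        return []
    m = len(X[0])
    keep = [False] * n
    for j in range(m):
        # Project onto attribute j and sort: O(n log n) per attribute,
        # hence O(m n log n) overall, with no pairwise distances.
        order = sorted(range(n), key=lambda i: X[i][j])
        for pos, i in enumerate(order):
            left_differs = pos > 0 and y[order[pos - 1]] != y[i]
            right_differs = pos < n - 1 and y[order[pos + 1]] != y[i]
            # Assumption: an example next to a class change in this
            # projection lies near an axis-parallel boundary, so keep it.
            if left_differs or right_differs:
                keep[i] = True
    return [i for i in range(n) if keep[i]]


if __name__ == "__main__":
    X = [[1.0, 1.0], [2.0, 1.5], [3.0, 2.0], [4.0, 5.0], [5.0, 6.0], [6.0, 7.0]]
    y = [0, 0, 0, 1, 1, 1]
    print(edit_by_ordered_projection(X, y))
```

In this toy data set the two classes are separable along both attributes, so only the two examples adjacent to the class change survive. Under this simplified rule a data set with a single class would be emptied entirely, which a real implementation would need to handle.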