Mapping classifiers and datasets

Authors:
Olcay Taner Yıldız
Affiliations:
Department of Computer Engineering, Işık University, 34398 İstanbul, Turkey
Venue:
Expert Systems with Applications: An International Journal
Year:
2011

Abstract

Given the posterior probability estimates of 14 classifiers on 38 datasets, we plot two-dimensional maps of classifiers and datasets using principal component analysis (PCA) and Isomap. The similarity between two classifiers indicates how correlated (or diverse) they are and can be used to decide whether to include both in an ensemble. Similarly, datasets that are too similar need not both be used in a general comparison experiment. The results show that (i) most of the datasets we used (approximately two thirds) are similar to each other, (ii) multilayer perceptrons and k-nearest neighbor variants are more similar to each other than support vector machine and decision tree variants, and (iii) the number of classes and the sample size have an effect on similarity.
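
The idea of mapping classifiers from their posterior estimates can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not say how the posterior vectors are built or compared, so the sketch assumes each classifier is one row of posterior probabilities over a shared pool of instances. The data are synthetic, and the names `pca_2d`, `posteriors`, `coords`, and `corr` are hypothetical. Only the PCA map is shown, computed with NumPy's SVD; Isomap is not covered.

```python
import numpy as np


def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    Xc = X - X.mean(axis=0, keepdims=True)            # center each column
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)  # Xc = U @ diag(S) @ Vt
    return U[:, :2] * S[:2]                           # 2-D coordinates per row


rng = np.random.default_rng(0)
n_instances = 500

# Synthetic stand-in for real posteriors (an assumption, not the paper's data):
# two families of 7 classifiers each, every member a noisy copy of its
# family's base posterior vector, clipped to the valid range [0, 1].
base_a = rng.random(n_instances)
base_b = rng.random(n_instances)
rows = []
for base in (base_a, base_b):
    for _ in range(7):
        noisy = base + 0.05 * rng.standard_normal(n_instances)
        rows.append(np.clip(noisy, 0.0, 1.0))
posteriors = np.vstack(rows)      # shape: (14 classifiers, 500 instances)

coords = pca_2d(posteriors)       # one 2-D point per classifier
corr = np.corrcoef(posteriors)    # pairwise correlation between classifiers
```

In this toy setting, classifiers from the same family land close together on the map and have high pairwise correlation. That is the kind of signal the abstract suggests using when deciding whether two classifiers are diverse enough to combine in an ensemble. Transposing the matrix, so that rows are datasets, would give a dataset map under the same assumptions.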