Public repositories have contributed to the maturation of experimental methodology in machine learning. Publicly available data sets have allowed researchers to assess their learners empirically and, together with open-source machine learning software, have favoured the emergence of comparative analyses of learners' performance within a common framework. These studies have established standard procedures for evaluating machine learning techniques. However, current claims, such as the superiority of enhanced algorithms, are biased by unsupported assumptions made in some common practices. This paper inspects the early steps of the methodology, namely data set selection. In particular, it examines how the most popular data repository in machine learning, the UCI repository, is exploited. We analyse the type, complexity, and use of UCI data sets. The study recommends the design of a mindful data repository, UCI+, which should include a set of properly characterised data sets consisting of a complete and representative sample of real-world problems, enriched with artificial benchmarks. The ultimate goal of UCI+ is to lay the foundations for a well-supported methodology for learner assessment.