Empirical evaluation of machine learning algorithms benefits from a controlled environment in which the properties of the dataset given to the learning algorithm are known in advance. Synthetic (artificial) datasets serve this purpose. Although frameworks for generating synthetic single-label datasets are publicly available, this is not the case for multi-label datasets, in which each instance is associated with a set of labels that are usually correlated. This work presents Mldatagen, a multi-label dataset generator framework that we have implemented and made publicly available to the community. Mldatagen currently implements two strategies: hypersphere and hypercube. For each label in the multi-label dataset, these strategies randomly generate a geometric shape (a hypersphere or a hypercube) and populate it with randomly generated points (instances). Each instance is then labeled according to the shapes it belongs to, which defines its multi-label. Experiments with a multi-label classification algorithm on six synthetic datasets illustrate the use of Mldatagen.
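To make the hypersphere strategy concrete, the following is a minimal Python sketch of the idea as the abstract describes it: one random hypersphere per label, random points placed inside each shape, and a multi-label derived from every shape that contains a point. It is not Mldatagen's code. The function names, the center range of [-1, 1], the radius range, and the uniform sampling scheme are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def random_hypersphere(dim, rng, min_radius=0.2, max_radius=0.5):
    """Draw one hypersphere: a random center and a random radius.

    Assumption: centers lie in [-1, 1]^dim and radii in
    [min_radius, max_radius]; the abstract does not give these ranges.
    """
    center = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    radius = rng.uniform(min_radius, max_radius)
    return center, radius


def sample_inside(center, radius, rng):
    """Sample a point uniformly inside the hypersphere (assumed scheme).

    A Gaussian vector gives a uniformly random direction; scaling the
    length by u ** (1 / dim) makes the point uniform over the volume.
    """
    dim = len(center)
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in direction))
    scale = radius * rng.random() ** (1.0 / dim) / norm
    return [c + scale * x for c, x in zip(center, direction)]


def contains(center, radius, point, eps=1e-9):
    """Return True if the point lies inside the hypersphere.

    eps absorbs floating-point rounding for points near the surface.
    """
    squared = sum((p - c) ** 2 for p, c in zip(point, center))
    return squared <= (radius + eps) ** 2


def generate(n_labels, dim, points_per_label, seed=0):
    """Build a synthetic multi-label dataset (hypothetical interface).

    Returns the shapes and a list of (point, labels) pairs, where
    labels[k] is 1 if the point falls inside the shape of label k.
    """
    rng = random.Random(seed)
    shapes = [random_hypersphere(dim, rng) for _ in range(n_labels)]
    dataset = []
    for center, radius in shapes:
        for _ in range(points_per_label):
            point = sample_inside(center, radius, rng)
            labels = [int(contains(c, r, point)) for c, r in shapes]
            dataset.append((point, labels))
    return shapes, dataset


if __name__ == "__main__":
    shapes, dataset = generate(n_labels=3, dim=2, points_per_label=5, seed=42)
    for point, labels in dataset:
        print([round(x, 3) for x in point], labels)
```

Because the shapes are placed independently, they may overlap, and a point in an overlap region receives several labels at once. A hypercube variant would follow the same outline, replacing the sampling and containment functions with per-coordinate interval checks.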