The importance of selecting relevant features for data modeling is well recognized in machine learning. This paper discusses the application of an evolutionary feature selection method to generate input data for unsupervised learning in DARA (Dynamic Aggregation of Relational Attributes), with the aim of improving the descriptive accuracy of the DARA algorithm. The DARA algorithm is designed to summarize data stored in non-target tables by clustering them into groups, where multiple records stored in the non-target tables correspond to a single record stored in the target table. This paper addresses the problem of optimizing the feature selection process so that it selects a relevant set of features for the DARA algorithm. It does so with an evolutionary algorithm and evaluates several scoring measures as fitness functions for finding the best set of relevant features. The results show that unsupervised learning in DARA can be improved by selecting a set of relevant features according to a specified fitness function that includes measures of the dispersion and purity of the clusters produced.
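As a rough illustration of the idea, and not the paper's actual implementation, the following sketch shows how an evolutionary search over feature subsets might be driven by a cluster-quality score. Everything named here is an assumption made for the example: the bitmask encoding, the `evolve_feature_mask` and `purity` helpers, the tournament selection, the one-point crossover, and all parameter defaults. The fitness function is supplied by the caller, so a score combining dispersion and purity could be plugged in; only a simple purity measure is shown.

```python
import random
from collections import Counter


def purity(cluster_ids, labels):
    """Fraction of records whose label equals the majority label of their cluster.

    Hypothetical helper: one common definition of cluster purity.
    """
    by_cluster = {}
    for cluster, label in zip(cluster_ids, labels):
        by_cluster.setdefault(cluster, []).append(label)
    majority_total = sum(
        Counter(members).most_common(1)[0][1] for members in by_cluster.values()
    )
    return majority_total / len(labels)


def evolve_feature_mask(n_features, fitness, pop_size=20, generations=30,
                        mutation_rate=0.1, seed=0):
    """Search for a feature subset (a tuple of 0/1 bits) that maximizes `fitness`.

    Assumes n_features >= 2 and pop_size >= 2. `fitness` takes a mask and
    returns a number; higher is better.
    """
    rng = random.Random(seed)
    population = [
        tuple(rng.randint(0, 1) for _ in range(n_features))
        for _ in range(pop_size)
    ]
    best = max(population, key=fitness)

    def tournament():
        # Pick two distinct individuals and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best]  # elitism: the best mask so far always survives
        while len(children) < pop_size:
            parent1, parent2 = tournament(), tournament()
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = parent1[:cut] + parent2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(
                bit ^ 1 if rng.random() < mutation_rate else bit
                for bit in child
            )
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best
```

In a setting like the one described above, the fitness function would cluster the data using only the features switched on in the mask and then score the resulting clusters, for example with `purity` when reference labels are available. Because of the elitism step, the best score found never decreases from one generation to the next.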