Unsupervised learning of mutagenesis molecules structure based on an evolutionary-based features selection in DARA

Authors:
Rayner Alfred;Irwansah Amran;Leau Yu Beng;Tan Soo Fun
Affiliations:
School of Engineering and Information Technology, Universiti Malaysia Sabah, Jalan UMS, Kota Kinabalu, Sabah, Malaysia;School of Engineering and Information Technology, Universiti Malaysia Sabah, Jalan UMS, Kota Kinabalu, Sabah, Malaysia;School of Engineering and Information Technology, Universiti Malaysia Sabah, Jalan UMS, Kota Kinabalu, Sabah, Malaysia;School of Engineering and Information Technology, Universiti Malaysia Sabah, Jalan UMS, Kota Kinabalu, Sabah, Malaysia
Venue:
AI'12 Proceedings of the 25th Australasian joint conference on Advances in Artificial Intelligence
Year:
2012

Abstract

The importance of selecting relevant features for data modeling has long been recognized in machine learning. This paper discusses the application of an evolutionary feature selection method to generate input data for unsupervised learning in DARA (Dynamic Aggregation of Relational Attributes). The evolutionary feature selection process is applied to improve the descriptive accuracy of the DARA algorithm. The DARA algorithm is designed to summarize data stored in non-target tables by clustering them into groups, where multiple records stored in the non-target tables correspond to a single record stored in a target table. This paper addresses the issue of optimizing the feature selection process so that it selects a relevant set of features for the DARA algorithm by using an evolutionary algorithm, and it evaluates several scoring measures as fitness functions for finding the best set of relevant features. The results show that unsupervised learning in DARA can be improved by selecting a set of relevant features based on a specified fitness function that includes measures of the dispersion and purity of the clusters produced.
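
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain k-means stands in for DARA's clustering step, the toy data set is synthetic, and the way purity and dispersion are combined into one fitness value (purity divided by one plus the mean within-cluster dispersion) is an assumption made only for illustration. All function names (`kmeans`, `purity`, `fitness`, `evolve`, `toy_data`) are hypothetical.

```python
import random

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, rng, iters=15):
    """Plain k-means (a stand-in for DARA's clustering step)."""
    centroids = rng.sample(rows, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(r, centroids[c])) for r in rows]
        for c in range(k):
            members = [r for r, l in zip(rows, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

def purity(labels, classes):
    """Fraction of records that belong to the majority class of their cluster."""
    hits = 0
    for c in set(labels):
        members = [y for l, y in zip(labels, classes) if l == c]
        hits += max(members.count(v) for v in set(members))
    return hits / len(labels)

def fitness(mask, data, classes, k=2):
    """Assumed fitness: high purity and low dispersion both raise the score."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature set cannot be clustered
    rows = [tuple(r[i] for i in cols) for r in data]
    labels, centroids = kmeans(rows, k, random.Random(0))  # fixed seed: deterministic
    dispersion = sum(sqdist(r, centroids[l]) for r, l in zip(rows, labels))
    dispersion /= len(rows) * len(cols)
    return purity(labels, classes) / (1.0 + dispersion)

def evolve(data, classes, generations=15, pop_size=12, p_mut=0.1, seed=1):
    """Genetic search over bit masks, where each bit switches one feature on or off."""
    rng = random.Random(seed)
    n = len(data[0])
    cache = {}

    def score(mask):
        if mask not in cache:
            cache[mask] = fitness(mask, data, classes)
        return cache[mask]

    # Seed with the full feature set so the result is never worse than no selection.
    pop = [tuple([1] * n)]
    pop += [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size - 1)]
    for _ in range(generations):
        children = [max(pop, key=score)]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament selection
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n)               # one-point crossover
            child = a[:cut] + b[cut:]
            child = tuple(bit ^ 1 if rng.random() < p_mut else bit for bit in child)
            children.append(child)
        pop = children
    best = max(pop, key=score)
    return best, score(best)

def toy_data(seed=0, per_class=20):
    """Synthetic data: two informative features followed by three noise features."""
    rng = random.Random(seed)
    data, classes = [], []
    for label, centre in ((0, 0.0), (1, 1.0)):
        for _ in range(per_class):
            informative = [centre + rng.gauss(0, 0.1) for _ in range(2)]
            noise = [rng.random() for _ in range(3)]
            data.append(tuple(informative + noise))
            classes.append(label)
    return data, classes

data, classes = toy_data()
best_mask, best_score = evolve(data, classes)
print(best_mask, round(best_score, 3))
```

Because the full feature set is part of the initial population and the best mask is always carried forward, the returned mask scores at least as well under this assumed fitness as clustering on all features. Swapping in a different scoring measure only requires changing `fitness`.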