A Genetic-Based Feature Construction Method for Data Summarisation

Authors:
Rayner Alfred
Affiliations:
School of Engineering and Information Technology, Universiti Malaysia Sabah, Kota Kinabalu, Malaysia 88999
Venue:
ADMA '08 Proceedings of the 4th international conference on Advanced Data Mining and Applications
Year:
2008

Abstract

The importance of input representation has long been recognised in machine learning. This paper discusses the application of genetic-based feature construction methods to generate input data for the data summarisation method called Dynamic Aggregation of Relational Attributes (DARA). Here, feature construction methods are applied in order to improve the descriptive accuracy of the DARA algorithm. The DARA algorithm is designed to summarise data stored in the non-target tables by clustering them into groups, where multiple records stored in non-target tables correspond to a single record stored in a target table. This paper addresses the question of whether the descriptive accuracy of the DARA algorithm benefits from the feature construction process. This involves solving the problem of constructing a relevant set of features for the DARA algorithm by using a genetic-based algorithm. This work also evaluates several scoring measures used as fitness functions to find the best set of constructed features.
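
The general idea of searching for a set of constructed features with a genetic algorithm can be illustrated with a minimal sketch. It is not the DARA implementation or the method evaluated in the paper. It assumes a toy setting in which a chromosome is a bit mask over the original attributes, the constructed feature is the combination of the selected attribute values, and the fitness function is information gain minus a small size penalty. The function names, the penalty, the GA parameters and the XOR-style data are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after partitioning the rows by feature value."""
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def construct_feature(rows, chromosome):
    """Build one constructed feature by combining the attributes whose bit is set."""
    return [tuple(v for v, bit in zip(row, chromosome) if bit) for row in rows]


def fitness(chromosome, rows, labels, penalty=0.01):
    """Toy scoring measure (assumption): gain of the constructed feature minus a size penalty."""
    gain = information_gain(construct_feature(rows, chromosome), labels)
    return gain - penalty * sum(chromosome)


def evolve(rows, labels, generations=30, pop_size=12, mutation_rate=0.1, seed=0):
    """Simple generational GA with elitism; needs at least two attributes per row."""
    rng = random.Random(seed)
    n = len(rows[0])

    def score(chromosome):
        return fitness(chromosome, rows, labels)

    population = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size)]
    for _ in range(generations):
        children = [max(population, key=score)]  # elitism: carry the best forward
        while len(children) < pop_size:
            # Binary tournament selection for each parent.
            mother = max(rng.sample(population, 2), key=score)
            father = max(rng.sample(population, 2), key=score)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n - 1)
            child = mother[:cut] + father[cut:]
            child = tuple(1 - bit if rng.random() < mutation_rate else bit for bit in child)
            children.append(child)
        population = children
    return max(population, key=score)


# Toy data: the class is a XOR b, and c is irrelevant noise.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a ^ b for a, b, c in rows]

best = evolve(rows, labels)
print("best chromosome:", best, "fitness:", round(fitness(best, rows, labels), 3))
```

In this toy data neither `a` nor `b` is informative on its own, while the feature that combines them determines the class exactly. This is the kind of attribute interaction that feature construction is meant to expose. Swapping in a different scoring measure only requires replacing `fitness`.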