Data-Driven Constructive Induction

Authors:
Eric Bloedorn;Ryszard S. Michalski
Venue:
IEEE Intelligent Systems
Year:
1998

Citing 5
Cited 26

Learning Two-Tiered Descriptions of Flexible Concepts: The POSEIDON System

Machine Learning
Hypothesis-Driven Constructive Induction in AQ17-HCI: A Method and Experiments

Machine Learning - Special issue on evaluating and changing representation
The AQ17-DCI System for Data-Driven Constructive Induction and its Application to the Analysis of World Economics

ISMIS '96 Proceedings of the 9th International Symposium on Foundations of Intelligent Systems
Machine learning of user profiles: representational issues

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
Generation of attributes for learning algorithms

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1

LEARNABLE EVOLUTION MODEL: Evolutionary Processes Guided by Machine Learning

Machine Learning - Special issue on multistrategy learning
Feature Selection vs Theory Reformulation: A Study of Genetic Refinement of Knowledge-based Neural Networks

Machine Learning - Special issue on multistrategy learning
Application of intelligent agent technology for managerial data analysis and mining

ACM SIGMIS Database
Constructing X-of-N Attributes for Decision Tree Learning

Machine Learning
Selecting Examples for Partial Memory Learning

Machine Learning
Attribute generation based on association rules

Knowledge and Information Systems
A Genetic Programming Ecosystem

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Foundations of Designing Computational Knowledge Discovery Processes

Progress in Discovery Science, Final Report of the Japanese Discovery Science Project
Information architecture without internal theory: an inductive design process

Journal of the American Society for Information Science and Technology
Feature construction for reduction of tabular knowledge-based systems

Information Sciences—Informatics and Computer Science: An International Journal
Using genetic programming and decision trees for generating structural descriptions of four bar mechanisms

Artificial Intelligence for Engineering Design, Analysis and Manufacturing
Solving multi-instance problems with classifier ensemble based on constructive clustering

Knowledge and Information Systems - Special Issue on Mining Low-Quality Data
Reducing decision tree fragmentation through attribute value grouping: A comparative study

Intelligent Data Analysis
Iterative feature construction for improving inductive learning algorithms

Expert Systems with Applications: An International Journal
Study on the Non-expandability of DNF and Its Application to Incremental Induction

ICIRA '08 Proceedings of the First International Conference on Intelligent Robotics and Applications: Part I
A pilot study on acquiring metric temporal constraints for events

ARTE '06 Proceedings of the Workshop on Annotating and Reasoning about Time and Events
Generation of globally relevant continuous features for classification

PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Parameterized semi-supervised classification based on support vector for multi-relational data

ICNC'06 Proceedings of the Second international conference on Advances in Natural Computation - Volume Part I
Self-adaptive two-phase support vector clustering for multi-relational data mining

PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Adaptive support vector clustering for multi-relational data mining

ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part I
A spectrum-based support vector algorithm for relational data semi-supervised classification

ICONIP'06 Proceedings of the 13 international conference on Neural Information Processing - Volume Part I
ORG - oblique rules generator

ICAISC'12 Proceedings of the 11th international conference on Artificial Intelligence and Soft Computing - Volume Part II
NLP-driven constructive learning for filtering an IR document stream

CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval
Unsupervised feature construction for improving data representation and semantics

Journal of Intelligent Information Systems
A feature construction approach for genetic iterative rule learning algorithm

Journal of Computer and System Sciences
CHIRA - Convex Hull Based Iterative Algorithm of Rules Aggregation

Fundamenta Informaticae

Abstract

Inductive-learning algorithms are powerful tools for identifying meaningful patterns in large volumes of data, and their use is increasing in fields such as data mining and computer vision. However, conventional inductive-learning algorithms are selective: they rely on existing, user-provided data to build their descriptions. Thus, data analysts must assume the important and sizeable task of determining relevant attributes. If they provide inadequate attributes for describing the training examples, the descriptions the program creates are likely to be excessively complex and inaccurate. Attributes can be inadequate for the learning task when they are weakly or indirectly relevant, conditionally relevant, or inappropriately measured. Constructive induction is a general approach for coping with inadequate attributes found in original data. It uses two intertwined searches, one for the best representation space and the other for the best hypothesis within that space, to formulate a generalized description of examples. Originally, constructive induction focused on improving the representation space by generating additional task-relevant attributes. It was subsequently observed that this was only one way of modifying the space. Attribute construction is a form of representation space expansion; attribute selection and attribute value abstraction are forms of representation space destruction. Furthermore, it became clear that this improvement of the representation space by expansion and destruction could have a profound impact on the simplicity and predictive accuracy of concepts induced from that space. The better the representation space, the easier it is for the program to learn. It is thus important not only to add relevant attributes but also to remove irrelevant ones and to find a useful level of precision for the attribute values.
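As a rough illustration of the three ways of modifying a representation space named above (attribute construction, attribute value abstraction, and attribute selection), the following sketch applies one toy operator of each kind to a four-example dataset. The function names, the pairwise-difference constructor, the two-level abstraction, and the data are all assumptions made for this example; they are not taken from any AQ17 program.

```python
from itertools import combinations

def construct(rows, names):
    """Expansion: append the difference of every pair of numeric attributes."""
    new_names = list(names)
    new_rows = [list(r) for r in rows]
    for i, j in combinations(range(len(names)), 2):
        new_names.append(f"{names[i]}-{names[j]}")
        for r, out in zip(rows, new_rows):
            out.append(r[i] - r[j])
    return new_rows, new_names

def abstract(rows, col, cut):
    """Abstraction: replace a numeric value with a coarse two-level value."""
    return [r[:col] + ["low" if r[col] < cut else "high"] + r[col + 1:]
            for r in rows]

def select(rows, names, keep):
    """Selection: keep only the named attributes."""
    idx = [names.index(n) for n in keep]
    return [[r[i] for i in idx] for r in rows], list(keep)

# Toy data (hypothetical): the class is positive when width >= height.
names = ["width", "height"]
rows = [[3, 1], [2, 5], [4, 4], [1, 2]]
labels = [True, False, True, False]

rows2, names2 = construct(rows, names)                      # adds "width-height"
rows3 = abstract(rows2, names2.index("width-height"), 0)    # bins it at 0
reduced, reduced_names = select(rows3, names2, ["width-height"])

for example, label in zip(reduced, labels):
    print(example, label)
```

In the reduced space a single condition on one two-valued attribute separates the classes, whereas in the original space a learner restricted to attribute-value conditions would need a separate rule for each combination of width and height.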
Constructive induction methods are classified according to the information used to search for the best representation space: data-driven constructive induction (DCI) uses input examples, hypothesis-driven constructive induction (HCI) uses intermediate hypotheses, and knowledge-driven constructive induction (KCI) uses domain knowledge provided by an expert. In multistrategy constructive induction (MCI), two or more of these methods are used. This expanded definition of constructive induction guided our development of several constructive induction programs: AQ17-DCI, AQ17-HCI, and AQ17-MCI. These all use an AQ-type rule learning algorithm for conducting hypothesis search, hence the "AQ" prefix. Here we describe our latest methodology for data-driven constructive induction, implemented in AQ17-DCI. Our methodology combines the AQ-15c learning algorithm with a range of operators for improving the representation space. These operators are classified into constructors and destructors. Constructors extend the representation space using attribute generation methods, and destructors reduce the space using attribute selection methods and attribute abstraction. We integrated these operators, which are usually considered separately, into AQ17-DCI in a synergistic fashion. We tested the method on two real-world problems: text categorization and natural scene interpretation. The power of a constructive induction approach is illustrated by an example from the "second Monk's problem," which was used in an international competition of machine-learning programs.
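To make the effect of a constructed attribute concrete, the sketch below uses the second MONK's problem as it is defined in the public benchmark: six attributes with 3, 3, 2, 3, 4, and 2 values, and a target concept of "exactly two attributes take their first value." The abstract does not restate this definition, and it does not say which attribute AQ17-DCI actually builds, so the counting attribute here is a hypothetical stand-in chosen only to show how a change of representation can collapse a many-rule description into one condition.

```python
from itertools import combinations, product

# Attribute domain sizes of the MONK's problems (assumed from the benchmark).
DOMAINS = [3, 3, 2, 3, 4, 2]
PAIRS = list(combinations(range(len(DOMAINS)), 2))

def label_original_space(example):
    """Target concept written over the original attributes: one rule per
    attribute pair, each requiring those two attributes to equal 1 and
    every other attribute to differ from 1."""
    return any(
        all((example[k] == 1) == (k in pair) for k in range(len(example)))
        for pair in PAIRS
    )

def count_first_values(example):
    """Hypothetical constructed attribute: number of attributes equal to 1."""
    return sum(v == 1 for v in example)

examples = list(product(*(range(1, n + 1) for n in DOMAINS)))
positives = [e for e in examples if label_original_space(e)]

# In the extended space the whole concept is the single condition "count == 2".
rule_matches = all(
    (count_first_values(e) == 2) == label_original_space(e) for e in examples
)

print(len(PAIRS), "rules of six conditions each in the original space")
print(len(examples), "examples,", len(positives), "positive")
print("single-condition rule agrees on every example:", rule_matches)
```

The original-space description needs one six-condition rule for each of the 15 attribute pairs, while the description over the constructed attribute is a single condition that agrees with it on all 432 examples.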