Detecting feature interactions from accuracies of random feature subsets

Authors:
Thomas R. Ioerger
Affiliations:
-
Venue:
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference
Year:
1999

Abstract

Interaction among features notoriously causes difficulty for machine learning algorithms because the relevance of one feature for predicting the target class can depend on the values of other features. In this paper, we introduce a new method for detecting feature interactions by evaluating the accuracies of a learning algorithm on random subsets of features. We give an operational definition for feature interactions based on when a set of features allows a learning algorithm to achieve higher than expected accuracy, assuming independence. Then we show how to adjust the sampling of random subsets in a way that is fair and balanced, given a limited amount of time. Finally, we show how decision trees built from sets of interacting features can be converted into DNF expressions to form constructed features. We demonstrate the effectiveness of the method empirically by showing that it can improve the accuracy of the C4.5 decision-tree algorithm on several benchmark databases.
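
The final step mentioned in the abstract, converting a decision tree into a DNF expression, can be illustrated with a minimal sketch. It assumes boolean features and a binary class, and it uses the standard construction in which every root-to-leaf path ending in a positive leaf becomes one conjunction. The names `Leaf`, `Node`, `tree_to_dnf`, `format_dnf`, and `evaluate_dnf` are hypothetical and do not come from the paper, whose actual procedure (for example, its handling of C4.5's multi-valued or continuous tests) may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    label: bool  # True if this leaf predicts the positive class


@dataclass
class Node:
    feature: str       # name of the boolean feature tested at this node
    if_false: "Tree"   # subtree followed when the feature is False
    if_true: "Tree"    # subtree followed when the feature is True


Tree = Union[Leaf, Node]
Literal = Tuple[str, bool]   # (feature name, required value)
Term = List[Literal]         # a conjunction of literals


def tree_to_dnf(tree: Tree, path: Tuple[Literal, ...] = ()) -> List[Term]:
    """Collect one conjunction per root-to-leaf path that ends in a positive leaf."""
    if isinstance(tree, Leaf):
        return [list(path)] if tree.label else []
    return (tree_to_dnf(tree.if_false, path + ((tree.feature, False),))
            + tree_to_dnf(tree.if_true, path + ((tree.feature, True),)))


def format_dnf(terms: List[Term]) -> str:
    """Render the DNF as a readable boolean expression."""
    if not terms:
        return "False"  # no positive leaf: the concept is never satisfied

    def render(term: Term) -> str:
        if not term:
            return "True"  # the tree is a single positive leaf
        return "(" + " and ".join(n if v else f"not {n}" for n, v in term) + ")"

    return " or ".join(render(t) for t in terms)


def evaluate_dnf(terms: List[Term], example: Dict[str, bool]) -> bool:
    """Value of the constructed feature on one example."""
    return any(all(example[n] == v for n, v in term) for term in terms)


# Illustrative tree for XOR, a classic pair of interacting features:
# neither `a` nor `b` is predictive alone, but together they determine the class.
xor_tree = Node("a",
                if_false=Node("b", Leaf(False), Leaf(True)),
                if_true=Node("b", Leaf(True), Leaf(False)))

xor_dnf = tree_to_dnf(xor_tree)
print(format_dnf(xor_dnf))
```

Under these assumptions, `evaluate_dnf(xor_dnf, example)` yields the value of a single constructed feature that could be appended to each example before a learner is rerun on the augmented data.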