Learn++.MF: A random subspace approach for the missing feature problem

Authors:
Robi Polikar;Joseph DePasquale;Hussein Syed Mohammed;Gavin Brown;Ludmilla I. Kuncheva
Affiliations:
Electrical and Computer Eng., Rowan University, 201 Mullica Hill Road, Glassboro, NJ 08028, USA;Electrical and Computer Eng., Rowan University, 201 Mullica Hill Road, Glassboro, NJ 08028, USA;Electrical and Computer Eng., Rowan University, 201 Mullica Hill Road, Glassboro, NJ 08028, USA;University of Manchester, Manchester, England, UK;University of Bangor, Bangor, Wales, UK
Venue:
Pattern Recognition
Year:
2010

Citing 23
Cited 5

The Strength of Weak Learnability

Machine Learning
Original Contribution: Stacked generalization

Neural Networks
Decision Combination in Multiple Classifier Systems

IEEE Transactions on Pattern Analysis and Machine Intelligence
Hierarchical mixtures of experts and the EM algorithm

Neural Computation
Bagging predictors

Machine Learning
A decision-theoretic generalization of on-line learning and an application to boosting

Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual Europe an conference on computational learning theory (EuroCOLT'95), March 13–15, 1995
The Random Subspace Method for Constructing Decision Forests

IEEE Transactions on Pattern Analysis and Machine Intelligence
Training Algorithm with Incomplete Data for Feed-ForwardNeural Networks

Neural Processing Letters
Robust Learning with Missing Data

Machine Learning
Neural Network Ensembles

IEEE Transactions on Pattern Analysis and Machine Intelligence
Bagging and the Random Subspace Method for Redundant Feature Spaces

MCS '01 Proceedings of the Second International Workshop on Multiple Classifier Systems
Combining Pattern Classifiers: Methods and Algorithms

Combining Pattern Classifiers: Methods and Algorithms
A Hybrid Neural Network System for Pattern Classification Tasks with Missing Features

IEEE Transactions on Pattern Analysis and Machine Intelligence
Using diversity of errors for selecting members of a committee classifier

Pattern Recognition
Moderate diversity for better cluster ensembles

Information Fusion
Imputation through finite Gaussian mixture models

Computational Statistics & Data Analysis
On Classification with Incomplete Data

IEEE Transactions on Pattern Analysis and Machine Intelligence
Empirical likelihood confidence intervals for differences between two datasets with missing data

Pattern Recognition Letters
Impact of imputation of missing values on classification error for discrete data

Pattern Recognition
Learn++.NC: combining ensemble of classifiers with dynamically weighted consult-and-vote for efficient incremental learning of new classes

IEEE Transactions on Neural Networks
Rough neuro-fuzzy structures for classification with missing data

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Random feature subset selection for ensemble based classification of data with missing features

MCS'07 Proceedings of the 7th international conference on Multiple classifier systems
Learn++: an incremental learning algorithm for supervised neuralnetworks

IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews

A probabilistic model of classifier competence for dynamic ensemble selection

Pattern Recognition
Semi-supervised classification based on random subspace dimensionality reduction

Pattern Recognition
A classifier ensemble approach for the missing feature problem

Artificial Intelligence in Medicine
Single Classifier-based Multiple Classification Scheme for weak classifiers: An experimental comparison

Expert Systems with Applications: An International Journal
A Multi-Expert System for chlorine electrolyzer monitoring

Expert Systems with Applications: An International Journal

Quantified Score

Hi-index	0.01

Visualization

Abstract

We introduce Learn^+^+.MF, an ensemble-of-classifiers based algorithm that employs random subspace selection to address the missing feature problem in supervised classification. Unlike most established approaches, Learn^+^+.MF does not replace missing values with estimated ones, and hence does not need specific assumptions on the underlying data distribution. Instead, it trains an ensemble of classifiers, each on a random subset of the available features. Instances with missing values are classified by the majority voting of those classifiers whose training data did not include the missing features. We show that Learn^+^+.MF can accommodate substantial amount of missing data, and with only gradual decline in performance as the amount of missing data increases. We also analyze the effect of the cardinality of the random feature subsets, and the ensemble size on algorithm performance. Finally, we discuss the conditions under which the proposed approach is most effective.