A useful strategy for dealing with complex classification scenarios is the "divide and conquer" approach. The mixture of experts (MoE) technique applies this strategy by jointly training a set of classifiers, or experts, each specialized in a different region of the input space. A global model, or gate function, complements the experts by learning to weigh their relevance across the input space. Local feature selection is an attractive way to improve the specialization of the experts and the gate function, particularly for high-dimensional data: subsets of dimensions, or subspaces, are usually better suited to classifying instances located in different regions of the input space. Accordingly, this work contributes a regularized variant of MoE that incorporates an embedded process for local feature selection using L1 regularization. Experiments on artificial and real-world datasets provide evidence that the proposed method improves on the classical MoE technique in terms of both accuracy and sparseness of the solution. Furthermore, our results indicate that the advantages of the proposed technique grow with the dimensionality of the data.
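The abstract describes a softmax-gated mixture of experts whose expert weights are sparsified with an L1 penalty, so that each expert performs local feature selection. A minimal sketch of that idea is below; it is not the paper's exact algorithm — the toy data, the squared-error surrogate loss, and the proximal soft-thresholding step for the L1 term are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(Z)
    return e / e.sum(axis=1, keepdims=True)

# Toy data (an assumption, not from the paper): the relevant feature differs
# by region, so each expert benefits from selecting a different subspace.
n, d, k = 200, 4, 2                          # samples, features, experts
X = rng.normal(size=(n, d))
region = X[:, 0] > 0                         # which local region a point is in
y = np.where(region, X[:, 1] > 0, X[:, 2] > 0).astype(float)

W = rng.normal(scale=0.1, size=(k, d))       # expert weights (L1-penalized)
V = rng.normal(scale=0.1, size=(k, d))       # gate weights
lr, lam = 0.5, 0.01                          # step size, L1 strength
losses = []

for _ in range(500):
    G = softmax(X @ V.T)                     # gate responsibilities, (n, k)
    P = sigmoid(X @ W.T)                     # per-expert predictions, (n, k)
    yhat = (G * P).sum(axis=1)               # mixture prediction
    err = yhat - y
    losses.append(0.5 * np.mean(err ** 2))
    for j in range(k):
        # d(yhat)/dW_j = G_j * P_j * (1 - P_j) * x
        gW = ((err * G[:, j] * P[:, j] * (1 - P[:, j]))[:, None] * X).mean(axis=0)
        # d(yhat)/dV_j = G_j * (P_j - yhat) * x
        gV = ((err * G[:, j] * (P[:, j] - yhat))[:, None] * X).mean(axis=0)
        W[j] -= lr * gW
        V[j] -= lr * gV
    # Proximal (soft-threshold) step: drives small expert weights to exactly
    # zero, giving the embedded local feature selection the abstract describes.
    W = np.sign(W) * np.maximum(np.abs(W) - lr * lam, 0.0)

initial_loss, final_loss = losses[0], losses[-1]
accuracy = ((yhat > 0.5) == (y > 0.5)).mean()
```

Because the penalty is applied per expert, the zero pattern of each row of `W` can differ, which is what makes the feature selection local rather than global.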