New method for instance or prototype selection using mutual information in time series prediction

Authors:
A. Guillen;L. J. Herrera;G. Rubio;H. Pomares;A. Lendasse;I. Rojas
Affiliations:
University of Granada, Department of Computer Technology and Architecture, Spain;University of Granada, Department of Computer Technology and Architecture, Spain;University of Granada, Department of Computer Technology and Architecture, Spain;University of Granada, Department of Computer Technology and Architecture, Spain;Helsinki University of Technology, Information and Computer Science Department, Finland;University of Granada, Department of Computer Technology and Architecture, Spain
Venue:
Neurocomputing
Year:
2010

Abstract

The problem of selecting the patterns to be learned by a model is usually treated as a preprocessing step rather than as part of the design of the model itself. Information theory provides a robust theoretical framework for input variable selection through the concept of mutual information. Since the computation of mutual information has recently been proposed for regression tasks, this paper presents a new application of the concept: not to select variables, but to decide which prototypes should belong to the training data set in regression problems. The proposed methodology decides whether or not a prototype should belong to the training set, using as its criterion the estimate of the mutual information between the variables. The novelty of the approach is its focus on prototype selection for regression problems rather than classification, since most of the literature deals only with the latter. Another element that distinguishes this work from others is that it is not proposed as an outlier detector but as an algorithm that determines the best subset of input vectors when building a model to approximate the data. As the experimental section shows, the new method is able to identify a high percentage of the real data set when applied to highly distorted data sets.
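
The selection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the mutual information estimator or the exact acceptance rule, so everything below is an assumption made for illustration: a simple histogram (plug-in) estimator stands in for the estimator used in the paper, a single input variable is used, and the leave-one-out rule "drop a sample if removing it raises the estimated mutual information by more than a threshold" is a hypothetical criterion. The names `mutual_information`, `select_instances`, `bins`, and `threshold` are likewise hypothetical.

```python
import math
from collections import Counter


def _bin_index(value, lo, hi, bins):
    """Map a value to an equal-width bin index in [0, bins - 1]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * bins)
    return min(max(idx, 0), bins - 1)


def mutual_information(xs, ys, bins=4, x_range=None, y_range=None):
    """Histogram (plug-in) estimate of I(X; Y) in nats for 1-D samples.

    Assumption: this simple estimator replaces whatever estimator the
    paper uses; it is only meant to make the sketch self-contained.
    """
    n = len(xs)
    x_lo, x_hi = x_range if x_range else (min(xs), max(xs))
    y_lo, y_hi = y_range if y_range else (min(ys), max(ys))
    joint = Counter(
        (_bin_index(x, x_lo, x_hi, bins), _bin_index(y, y_lo, y_hi, bins))
        for x, y in zip(xs, ys)
    )
    # Marginal counts are obtained by summing the joint counts.
    px, py = Counter(), Counter()
    for (bx, by), count in joint.items():
        px[bx] += count
        py[by] += count
    mi = 0.0
    for (bx, by), count in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to count * n / (px * py).
        mi += (count / n) * math.log(count * n / (px[bx] * py[by]))
    return mi


def select_instances(xs, ys, bins=4, threshold=0.0):
    """Return indices of samples kept in the training set.

    Hypothetical rule: a sample is dropped when leaving it out increases
    the estimated mutual information by more than `threshold`.
    """
    # Fix the bin ranges once so every leave-one-out estimate is comparable.
    x_range = (min(xs), max(xs))
    y_range = (min(ys), max(ys))
    base = mutual_information(xs, ys, bins, x_range, y_range)
    kept = []
    for i in range(len(xs)):
        xs_wo = xs[:i] + xs[i + 1:]
        ys_wo = ys[:i] + ys[i + 1:]
        gain = mutual_information(xs_wo, ys_wo, bins, x_range, y_range) - base
        if gain <= threshold:
            kept.append(i)
    return kept


# Toy data: y = x on a grid, plus one heavily distorted sample at the end.
xs = list(range(40)) + [4]
ys = list(range(40)) + [36]
kept = select_instances(xs, ys)
```

In this toy case only the distorted sample lowers the dependence between input and output, so it is the only one whose removal increases the estimate. The leave-one-out loop recomputes the estimate for every sample and is therefore quadratic in the number of samples; it is written for clarity, not efficiency.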