Self-organization and associative memory: 3rd edition
Computational learning theory: an introduction
Topology representing networks
Neural Networks
Kernel-based equiprobabilistic topographic map formation
Neural Computation
A streaming ensemble algorithm (SEA) for large-scale classification
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Text Mining: Predictive Methods for Analyzing Unstructured Information
Relative information of type s, Csiszár's f-divergence, and information inequalities
Information Sciences—Informatics and Computer Science: An International Journal
Clustering for Data Mining: A Data Recovery Approach (Chapman & Hall/CRC Computer Science)
Estimation of Dependences Based on Empirical Data (Springer Series in Statistics)
An Experimental Study on Pedestrian Classification
IEEE Transactions on Pattern Analysis and Machine Intelligence
Data mining approaches for intrusion detection
SSYM'98 Proceedings of the 7th conference on USENIX Security Symposium - Volume 7
An adaptive personalized news dissemination system
Journal of Intelligent Information Systems
Circular backpropagation networks embed vector quantization
IEEE Transactions on Neural Networks
K-winner machines for pattern classification
IEEE Transactions on Neural Networks
Empirical measure of multiclass generalization performance: the K-winner machine case
IEEE Transactions on Neural Networks
Just-in-Time Adaptive Classifiers—Part I: Detecting Nonstationary Changes
IEEE Transactions on Neural Networks
Data-intensive applications use empirical methods to extract consistent information from huge samples. When applied to classification tasks, their aim is to optimize accuracy on unseen data; hence, a reliable prediction of the generalization error is of paramount importance. Theoretical models, such as Statistical Learning Theory, and empirical estimations, such as cross-validation, can both fit data-mining classification domains very well, provided some crucial assumptions are verified in advance. In particular, a stationary distribution of the observed data is critical, although this assumption is sometimes overlooked in practice. The paper formulates an operative criterion to verify the stationarity assumption; the method applies to both theoretical and practical predictions of generalization errors. The analysis addresses the specific case of clustering-based classifiers: the K-Winner Machine (KWM) model is used as a reference for its known theoretical bounds, while cross-validation provides an empirical counterpart for practical comparison. The criterion, based on efficient unsupervised, clustering-based probability-distribution estimation, is tested experimentally on a set of data-intensive applications, including intrusion detection for computer-network security, optical character recognition, text mining, and pedestrian detection. Experimental results confirm the effectiveness of the proposed approach in efficiently detecting nonstationarity.
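The abstract's core idea can be illustrated with a minimal sketch: quantize a reference sample with a small codebook, compare the cluster-occupancy histogram of incoming data against the reference one, and flag nonstationarity when the two distributions diverge. The codebook construction (random reference samples as prototypes), the symmetric KL divergence, and the `threshold` value below are illustrative assumptions, not the criterion actually used in the paper.

```python
import numpy as np

def occupancy_histogram(data, prototypes):
    """Assign each sample to its nearest prototype and return the
    normalized occupancy histogram of the codebook."""
    d = np.linalg.norm(data[:, None, :] - prototypes[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    counts = np.bincount(labels, minlength=len(prototypes)).astype(float)
    return counts / counts.sum()

def stationarity_check(reference, incoming, n_prototypes=8,
                       threshold=0.05, seed=0):
    """Crude nonstationarity detector (hypothetical sketch):
    build a codebook from random reference samples, then compare
    occupancy histograms via a smoothed symmetric KL divergence."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(reference), size=n_prototypes, replace=False)
    prototypes = reference[idx]  # assumed codebook: random reference samples
    p = occupancy_histogram(reference, prototypes) + 1e-9  # smoothing
    q = occupancy_histogram(incoming, prototypes) + 1e-9
    div = 0.5 * (np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))
    return div, bool(div > threshold)
```

On stationary data the two histograms agree up to sampling noise, so the divergence stays small; a shift in the underlying distribution redistributes the occupancy mass across prototypes and drives the divergence up.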