Recent years have witnessed rapidly growing interest in stream data mining. Despite the success achieved so far, current approaches generally assume that the class distribution of the stream data is relatively balanced. However, in applications such as network intrusion detection, credit fraud detection, and spam classification, the class distribution is typically imbalanced and misclassifying a minority example is very costly. Concept drift is an unavoidable issue in stream data mining, and it is even harder to handle when the classifier must learn from an imbalanced data stream whose target concept keeps drifting. In this article, we propose a selectively recursive approach (SERA) for learning from nonstationary imbalanced data streams. By selectively absorbing previously received minority examples into the current training data chunk and potentially assigning sampling probabilities proportionally to the majority and minority examples, SERA alleviates the difficulty that conventional stream data mining methods face when learning from nonstationary imbalanced data streams. Experiments on synthetic datasets show that, compared with existing approaches, our approach is competitive on general assessment metrics and achieves significant performance improvement in predicting minority instances.
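The idea of selectively absorbing past minority examples into the current chunk can be illustrated with a minimal sketch. This is not the article's implementation: the abstract does not state the selection criterion, so the sketch assumes Euclidean distance to the centroid of the current minority examples as the similarity measure. The function names, the `target_ratio` parameter, and the toy data are all hypothetical, and the proportional sampling step is omitted.

```python
import math


def select_past_minority(current_minority, past_minority, k):
    """Pick the k stored minority examples closest to the current minority centroid.

    Assumption: Euclidean distance to the centroid stands in for whatever
    similarity measure the article actually uses.
    """
    if k <= 0 or not current_minority or not past_minority:
        return []
    n = len(current_minority)
    dim = len(current_minority[0])
    centroid = tuple(sum(x[d] for x in current_minority) / n for d in range(dim))
    ranked = sorted(past_minority, key=lambda x: math.dist(x, centroid))
    return ranked[:k]


def build_training_chunk(chunk, past_minority, minority_label=1, target_ratio=0.5):
    """Amplify the current chunk with selected past minority examples.

    chunk: list of (features, label) pairs for the current time step.
    target_ratio: hypothetical desired minority-to-majority ratio after absorption.
    Returns the amplified training set and the updated minority memory.
    """
    majority = [(x, y) for x, y in chunk if y != minority_label]
    minority = [x for x, y in chunk if y == minority_label]
    # Number of past minority examples needed to reach the target ratio.
    needed = int(target_ratio * len(majority)) - len(minority)
    absorbed = select_past_minority(minority, past_minority, needed)
    training = list(chunk) + [(x, minority_label) for x in absorbed]
    # Remember the current minority examples for future chunks.
    memory = list(past_minority) + minority
    return training, memory


# Toy chunk: 8 majority points near the origin, 1 minority point at (5, 5).
chunk = [((0.1 * i, -0.1 * i), 0) for i in range(8)] + [((5.0, 5.0), 1)]
# Stored minority examples; (-3, -3) mimics an outdated concept after drift.
past = [(5.1, 4.9), (-3.0, -3.0), (4.8, 5.2)]

training, memory = build_training_chunk(chunk, past, target_ratio=0.375)
absorbed = [x for x, _ in training[len(chunk):]]
print(absorbed)
```

In this toy run the two stored examples near (5, 5) are absorbed, while the one at (-3, -3) is left out. That is the intended effect of selecting rather than reusing every past minority example: examples that no longer match the current concept do not contaminate the training chunk.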