Many applications now need to handle large amounts of streaming data, which often exhibit a skewed distribution, i.e. one or more classes are heavily under-represented compared with the others. Although class-imbalance learning has already been studied for static data in pattern recognition, little effort has been directed towards the classification of skewed data streams. Furthermore, while existing class-imbalance learning methods increase recognition accuracy on the minority class, they often harm global classification accuracy. Motivated by these observations, we develop an approach for classifying skewed data streams that integrates two ensembles of classifiers, one suited to non-skewed data and the other to skewed data. This approach substantially increases global accuracy compared with existing classification methods for skewed data. Experimental tests carried out on three public datasets show interesting results. As a further contribution, we study metrics for evaluating the performance of skewed data stream classification. We also review the literature on class-imbalance learning and on skewed data stream classification.
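To make the tension between minority-class recognition and global accuracy concrete, the following minimal sketch computes both quantities on a hypothetical skewed batch. The labels, the 95/5 class split, and the two toy predictors are illustrative assumptions; they are not data, results, or methods from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of all examples classified correctly (global accuracy)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of examples of class `positive` that are recovered."""
    hits = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    total = sum(t == positive for t in y_true)
    return hits / total if total else 0.0


# Hypothetical skewed batch: 95 majority (0) and 5 minority (1) examples.
y_true = [0] * 95 + [1] * 5

# Toy predictor A (assumption): always predicts the majority class.
pred_majority = [0] * 100

# Toy predictor B (assumption): recovers every minority example
# but mislabels 20 majority examples as minority.
pred_biased = [1] * 20 + [0] * 75 + [1] * 5

acc_a = accuracy(y_true, pred_majority)
rec_a = recall(y_true, pred_majority)
acc_b = accuracy(y_true, pred_biased)
rec_b = recall(y_true, pred_biased)

print(f"A: accuracy={acc_a:.2f}, minority recall={rec_a:.2f}")
print(f"B: accuracy={acc_b:.2f}, minority recall={rec_b:.2f}")
```

In this toy setting, predictor A scores higher on global accuracy while missing every minority example, and predictor B shows the opposite trade-off. This is why a single global accuracy figure can be misleading on skewed data and why per-class metrics are usually reported alongside it.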