In real-world settings, the class distribution of data may change after a classifier is built, degrading the classifier's performance. Previous attempts to solve this problem with the Class Distribution Estimation (CDE) method yield interesting performance; however, that method still retains some bias toward the training data, so we improve on it with an ensemble approach. Our Class Distribution Estimation-Ensemble (CDE-EM) methods estimate the class distribution from many models instead of one, resulting in less bias than the previous method. All methods are evaluated by accuracy on a set of benchmark UCI data sets. Experimental results demonstrate that our methods yield better performance when the class distribution of the test data differs from that of the training data.
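The core idea of estimating a class distribution from many models rather than one can be sketched as follows. This is an illustrative toy, not the authors' CDE-EM algorithm: the helper names (`train_threshold`, `estimate_class_distribution`), the 1-D threshold classifier, and the bootstrap resampling are all assumptions made for the sketch; the only point carried over from the abstract is averaging distribution estimates across an ensemble to reduce the bias of any single model.

```python
import random

def train_threshold(data):
    # data: list of (x, label) pairs with label in {0, 1}.
    # Toy 1-D classifier: threshold at the midpoint of the class means.
    pos = [x for x, y in data if y == 1]
    neg = [x for x, y in data if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def estimate_class_distribution(train, test_x, n_models=25, seed=0):
    # Estimate the positive-class proportion of unlabeled test data by
    # averaging the predicted-positive fraction over an ensemble of
    # classifiers, each fit on a bootstrap resample of the training set.
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_models):
        boot = [rng.choice(train) for _ in train]   # bootstrap resample
        if len({y for _, y in boot}) < 2:           # need both classes
            continue
        t = train_threshold(boot)
        # Fraction of test points this model labels positive.
        estimates.append(sum(x > t for x in test_x) / len(test_x))
    # Averaging over many models is what reduces single-model bias.
    return sum(estimates) / len(estimates)
```

For example, training on a balanced sample but testing on data that is mostly positive, the averaged ensemble estimate tracks the new test-time positive proportion rather than the 50/50 training proportion.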