Data stream classification is an active topic in data mining research. A central challenge is that the class priors may evolve along the data sequence. Algorithms have been proposed to estimate the dynamic class priors and adjust the classifier accordingly. However, the existing algorithms estimate the priors poorly because few samples are available from the target distribution. Sample size strongly affects parameter estimation, and small-sample effects degrade estimation accuracy. In this paper, we propose a novel parameter estimation method called transfer estimation, which uses samples not only from the target distribution but also from similar distributions. We apply this estimation method to the existing algorithms and obtain an improved algorithm. Experiments on both synthetic and real data sets show that the improved algorithm outperforms the existing algorithms on both class prior estimation and classification.
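The abstract does not give the estimator itself, so the following is only an illustrative sketch of the general idea, not the method proposed in the paper. It assumes that "similar distributions" are earlier windows of the stream, that each such window comes with a similarity weight in [0, 1] chosen by the user, and that the class prior is estimated by pooling weighted class counts. The function names `pooled_prior` and `adjust_posterior`, and the weighting scheme, are hypothetical. The posterior adjustment is the standard rescaling of a classifier's output by the ratio of new to training priors.

```python
def pooled_prior(target_counts, source_counts, source_weights):
    """Estimate class priors from target counts plus down-weighted counts
    from similar windows (an assumed, simplified stand-in for transfer
    estimation; the weights are user-supplied similarities in [0, 1])."""
    pooled = [float(c) for c in target_counts]
    for counts, weight in zip(source_counts, source_weights):
        for k, c in enumerate(counts):
            pooled[k] += weight * c
    total = sum(pooled)
    if total == 0:
        raise ValueError("no samples available to estimate the priors")
    return [c / total for c in pooled]


def adjust_posterior(posterior, train_prior, new_prior):
    """Rescale a classifier's posterior to a new class prior:
    p_new(c | x) is proportional to p_old(c | x) * new_prior[c] / train_prior[c]."""
    scaled = [p * n / t for p, t, n in zip(posterior, train_prior, new_prior)]
    norm = sum(scaled)
    return [s / norm for s in scaled]


# Four labeled samples in the target window, two larger similar windows.
target_counts = [3, 1]
source_counts = [[40, 60], [24, 56]]
source_weights = [0.5, 0.25]  # hypothetical similarity weights

prior = pooled_prior(target_counts, source_counts, source_weights)
adjusted = adjust_posterior([0.5, 0.5], [0.5, 0.5], prior)
```

With only four target samples, the raw estimate (0.75, 0.25) is dominated by sampling noise; pooling pulls it toward the similar windows in proportion to their weights. Setting every weight to zero recovers the target-only estimate.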