Data stream classification has drawn increasing attention from the data mining community in recent years. Relevant applications include network traffic monitoring, sensor network data analysis, Web click-stream mining, power consumption measurement, and dynamic tracing of stock fluctuations, to name a few. In such real-world applications, data stream classification typically faces three major challenges: concept drift, large data volumes, and partial labeling. As a result, training examples in data streams can be very diverse, and it is hard to learn accurate models efficiently. In this paper, we propose a novel framework that first categorizes diverse training examples into four types and assigns learning priorities to them. Then, we derive four learning cases based on the proportion and priority of the different types of training examples. Finally, for each learning case, we employ one of four SVM-based training models: classical SVM, semi-supervised SVM, transfer semi-supervised SVM, and relational k-means transfer semi-supervised SVM. We perform comprehensive experiments on real-world data streams that validate the utility of our approach.