Most existing work on data stream classification assumes that all streaming data are labeled and that class labels are immediately available. In real-world applications such as credit fraud and intrusion detection, however, this assumption does not always hold, so learning from concept-drifting data streams that contain unlabeled data remains a challenge. With this motivation, we propose SUN, a Semi-supervised classification algorithm for data streams with concept drifts and UNlabeled data. In SUN, a clustering algorithm developed from k-Modes produces concept clusters at the leaves of an incremental decision tree. Potential concept drifts are distinguished from noise according to the deviations between historical concept clusters and new ones. Extensive studies on both synthetic and real-world data demonstrate that SUN performs well compared with several state-of-the-art online supervised and semi-supervised algorithms, even when more than 90% of the data are unlabeled. We therefore conclude that SUN provides a promising framework for handling concept-drifting data streams with unlabeled data.
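The idea of comparing historical and new concept clusters can be illustrated with a minimal sketch. It is not the SUN procedure itself: it assumes purely categorical attributes, the simple matching dissimilarity of basic k-Modes, a hypothetical farthest-first seeding, and a hypothetical normalized deviation measure with an arbitrary threshold of 0.3. All function names and the toy windows are invented for illustration.

```python
from collections import Counter

def mismatch(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))

def mode_of(rows):
    """Attribute-wise most frequent value of a non-empty list of rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))

def seed_modes(rows, k):
    """Hypothetical seeding: first row, then the rows farthest from chosen seeds."""
    modes = [rows[0]]
    while len(modes) < k:
        modes.append(max(rows, key=lambda r: min(mismatch(r, m) for m in modes)))
    return modes

def k_modes(rows, k, max_iter=20):
    """Basic k-Modes: assign rows to the nearest mode, recompute modes, repeat."""
    modes = seed_modes(rows, k)
    for _ in range(max_iter):
        clusters = [[] for _ in modes]
        for row in rows:
            nearest = min(range(k), key=lambda i: mismatch(row, modes[i]))
            clusters[nearest].append(row)
        # An empty cluster keeps its previous mode.
        new_modes = [mode_of(c) if c else m for c, m in zip(clusters, modes)]
        if new_modes == modes:
            break
        modes = new_modes
    return modes

def deviation(old_modes, new_modes):
    """Assumed measure: mean distance from each new mode to its closest
    old mode, divided by the number of attributes (range 0 to 1)."""
    n_attrs = len(new_modes[0])
    total = sum(min(mismatch(n, o) for o in old_modes) for n in new_modes)
    return total / (len(new_modes) * n_attrs)

def judge(old_modes, new_modes, threshold=0.3):
    """Assumed rule: small deviation is noise, large deviation is potential drift."""
    if deviation(old_modes, new_modes) < threshold:
        return "noise"
    return "potential drift"

# Toy windows of categorical records (invented data).
history = [("a", "x", "1"), ("a", "x", "2"), ("b", "y", "1"), ("b", "y", "2")]
noisy   = [("a", "x", "1"), ("a", "x", "2"), ("b", "y", "1"), ("b", "x", "2")]
drifted = [("c", "z", "1"), ("c", "z", "2"), ("d", "w", "1"), ("d", "w", "2")]

old_modes = k_modes(history, k=2)
noisy_verdict = judge(old_modes, k_modes(noisy, k=2))
drift_verdict = judge(old_modes, k_modes(drifted, k=2))
print(old_modes, noisy_verdict, drift_verdict)
```

In this toy setting, a single flipped attribute value moves the cluster modes only slightly, while a window drawn from entirely different attribute values moves every mode far from its historical counterpart. A real system would need to choose the deviation measure and threshold empirically.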