Conventional stream mining algorithms focus on single and stand-alone mining tasks. Given the single-pass nature of data streams, it makes sense to maximize throughput by performing multiple complementary mining tasks concurrently. We investigate the potential of concurrent semi-supervised learning on data streams and propose an incremental algorithm called CSL-Stream (Concurrent Semi-supervised Learning of Data Streams) that performs clustering and classification at the same time. Experiments using common synthetic and real datasets show that CSL-Stream outperforms prominent clustering and classification algorithms (D-Stream and SmSCluster) in terms of accuracy, speed and scalability. The success of CSL-Stream paves the way for a new research direction in understanding latent commonalities among various data mining tasks in order to exploit the power of concurrent stream mining.
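The idea of serving two mining tasks from a single pass can be illustrated with a toy example. The sketch below is not CSL-Stream itself, whose details the abstract does not give. It assumes a simple fixed grid, and the class `ConcurrentGridLearner`, its parameters `cell_width` and `dense_threshold`, and the sample stream are all hypothetical names chosen for illustration. Each arriving point updates one shared structure exactly once; clustering then reads the per-cell densities, and classification reads the per-cell label counts, with unlabeled points (label `None`) contributing to density only.

```python
from collections import Counter, defaultdict


class ConcurrentGridLearner:
    """Toy single-pass learner: one shared grid serves clustering and classification.

    Illustrative only; this is an assumed design, not the CSL-Stream algorithm.
    """

    def __init__(self, cell_width=1.0, dense_threshold=3):
        self.cell_width = cell_width
        self.dense_threshold = dense_threshold
        self.density = Counter()             # number of points seen per grid cell
        self.labels = defaultdict(Counter)   # label counts per grid cell

    def _cell(self, point):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(int(x // self.cell_width) for x in point)

    def update(self, point, label=None):
        # Single pass: each point is touched once and feeds both tasks.
        cell = self._cell(point)
        self.density[cell] += 1
        if label is not None:
            self.labels[cell][label] += 1

    def clusters(self):
        # Clustering view: connected components of adjacent dense cells.
        remaining = {c for c, n in self.density.items() if n >= self.dense_threshold}
        result = []
        while remaining:
            stack = [remaining.pop()]
            component = set(stack)
            while stack:
                cell = stack.pop()
                for other in list(remaining):
                    if all(abs(a - b) <= 1 for a, b in zip(cell, other)):
                        remaining.remove(other)
                        component.add(other)
                        stack.append(other)
            result.append(component)
        return result

    def predict(self, point):
        # Classification view: majority label of the point's cell, if any.
        counts = self.labels.get(self._cell(point))
        if not counts:
            return None
        return counts.most_common(1)[0][0]


# Hypothetical partially labeled stream: (point, label or None).
stream = [
    ((0.2, 0.3), "a"), ((0.5, 0.9), None),
    ((1.1, 0.4), "a"), ((1.6, 0.2), None),
    ((5.5, 5.5), "b"), ((5.1, 5.9), None),
]

learner = ConcurrentGridLearner(cell_width=1.0, dense_threshold=2)
for point, label in stream:
    learner.update(point, label)

found_clusters = learner.clusters()
prediction_near_origin = learner.predict((0.7, 0.1))
prediction_far = learner.predict((5.9, 5.0))
prediction_unseen = learner.predict((9.0, 9.0))
```

In this sketch the two tasks share all bookkeeping, so adding classification to the clustering pass costs only one extra counter update per labeled point. A real stream algorithm would also need to age out old data to follow concept drift, which this sketch omits.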