Stream clustering algorithms are traditionally designed to process streams efficiently and to adapt to the evolution of the underlying population, without assuming any prior knowledge about the data. In many cases, however, some domain or background knowledge is available, and rather than using it only for external validation of the clustering results, it can be used to guide the clustering process itself. For non-stream data, domain knowledge is exploited in the context of semi-supervised clustering. In this paper, we extend the static semi-supervised learning paradigm to streams. We present C-DenStream, a density-based clustering algorithm for data streams that incorporates domain information in the form of constraints. We also propose a novel method for using background knowledge in data streams. A performance study over a number of real and synthetic data sets demonstrates the effectiveness and efficiency of our method. To our knowledge, this is the first approach to include domain knowledge in clustering for data streams.