Data stream mining is an active research area that has recently emerged to discover knowledge from large amounts of continuously generated data. In this context, several data stream clustering algorithms have been proposed to perform unsupervised learning. Nevertheless, data stream clustering poses several challenges, such as dealing with nonstationary, unbounded data that arrive in an online fashion. The intrinsic nature of stream data requires algorithms that process data objects quickly and incrementally while respecting time and memory limitations. In this article, we present a survey of data stream clustering algorithms, providing a thorough discussion of the main design components of state-of-the-art algorithms. In addition, this work addresses the temporal aspects involved in data stream clustering and presents an overview of the experimental methodologies usually employed. A number of references are provided that describe applications of data stream clustering in different domains, such as network intrusion detection, sensor networks, and stock market analysis. Information regarding software packages and data repositories is also provided to help researchers and practitioners. Finally, some important issues and open questions that can be the subject of future research are discussed.
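To make the requirement of fast, incremental processing under memory limits concrete, the following is a minimal sketch of sequential (online) k-means. It is not an algorithm taken from the survey itself; the class name `OnlineKMeans` and its `update` method are hypothetical names chosen for illustration. Each arriving object is examined once, assigned to its nearest centroid, and then discarded, so memory use depends only on the number of clusters and the dimensionality, not on the length of the stream.

```python
from typing import List, Sequence


class OnlineKMeans:
    """Hypothetical illustration of one-pass, constant-memory clustering."""

    def __init__(self, k: int) -> None:
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        self.centroids: List[List[float]] = []  # one running mean per cluster
        self.counts: List[int] = []             # objects absorbed per cluster

    def update(self, x: Sequence[float]) -> int:
        """Absorb one data object and return the index of its cluster."""
        # Assumption: the first k objects simply seed the centroids.
        if len(self.centroids) < self.k:
            self.centroids.append([float(v) for v in x])
            self.counts.append(1)
            return len(self.centroids) - 1

        # Find the nearest centroid by squared Euclidean distance.
        nearest = min(
            range(self.k),
            key=lambda i: sum((c - v) ** 2 for c, v in zip(self.centroids[i], x)),
        )

        # Incremental mean update: move the centroid toward x by 1/count.
        self.counts[nearest] += 1
        step = 1.0 / self.counts[nearest]
        self.centroids[nearest] = [
            c + step * (v - c) for c, v in zip(self.centroids[nearest], x)
        ]
        return nearest


if __name__ == "__main__":
    model = OnlineKMeans(k=2)
    for point in [(0.0, 0.0), (10.0, 10.0), (0.0, 2.0), (10.0, 12.0)]:
        model.update(point)
    print(model.centroids)
```

This sketch weights all past objects equally, so it does not handle the nonstationary case mentioned above; a common adaptation, again stated only as an assumption here, is to replace the `1/count` step with a fixed or decaying learning rate so that older objects are gradually forgotten.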