The three main requirements for clustering data streams on-line are a single pass over the data, high processing speed, and low memory consumption. We propose an algorithm that meets these requirements by introducing an incremental grid data structure that summarizes the data stream on-line. To handle high-dimensional data, the algorithm uses a simple heuristic to select a subset of dimensions, and all clustering operations are performed on that subset. A performance study on a real network intrusion detection data stream demonstrates the efficiency and effectiveness of the proposed algorithm.
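
To make the idea of an incremental grid summary concrete, the following is a minimal Python sketch, not the algorithm proposed in the paper. The abstract does not specify the grid layout, the dimension-selection heuristic, or how clusters are read off the grid, so every name and choice here is an assumption made for illustration: `select_dims` (a hypothetical heuristic that keeps the highest-variance dimensions of an initial sample), `IncrementalGrid`, the fixed `cell_width`, the `min_count` density threshold, and the rule that adjacent dense cells form one cluster.

```python
from collections import defaultdict, deque
from statistics import pvariance


def select_dims(sample, k):
    """Hypothetical heuristic (an assumption, not the paper's method):
    keep the k dimensions with the largest variance in an initial sample."""
    n_dims = len(sample[0])
    variances = [pvariance([p[d] for p in sample]) for d in range(n_dims)]
    ranked = sorted(range(n_dims), key=lambda d: variances[d], reverse=True)
    return sorted(ranked[:k])


class IncrementalGrid:
    """Per-cell point counts over the selected dimensions only.

    Each point is seen once and then discarded, so memory grows with the
    number of occupied cells rather than with the length of the stream.
    """

    def __init__(self, dims, cell_width):
        self.dims = dims              # indices of the selected dimensions
        self.cell_width = cell_width  # assumed uniform cell size
        self.counts = defaultdict(int)

    def insert(self, point):
        # Map the point to its cell and increment that cell's counter.
        key = tuple(int(point[d] // self.cell_width) for d in self.dims)
        self.counts[key] += 1

    def clusters(self, min_count):
        """Group dense cells that are neighbors along one axis (assumed rule)."""
        dense = {c for c, n in self.counts.items() if n >= min_count}
        seen, result = set(), []
        for start in dense:
            if start in seen:
                continue
            seen.add(start)
            component, queue = [], deque([start])
            while queue:  # breadth-first search over adjacent dense cells
                cell = queue.popleft()
                component.append(cell)
                for axis in range(len(cell)):
                    for step in (-1, 1):
                        nb = cell[:axis] + (cell[axis] + step,) + cell[axis + 1:]
                        if nb in dense and nb not in seen:
                            seen.add(nb)
                            queue.append(nb)
            result.append(sorted(component))
        return sorted(result)


# Toy stream: the middle dimension is constant and carries no information.
stream = [(0.1, 5.0, 0.2), (0.3, 5.0, 0.4), (1.2, 5.0, 0.1),
          (8.5, 5.0, 9.1), (8.7, 5.0, 9.3)]
dims = select_dims(stream, k=2)
grid = IncrementalGrid(dims, cell_width=1.0)
for point in stream:  # single pass over the data
    grid.insert(point)
print(dims)                        # [0, 2]
print(grid.clusters(min_count=2))  # [[(0, 0)], [(8, 9)]]
```

In this sketch the per-point cost is one dictionary update, which is what lets a grid summary satisfy the one-pass and speed requirements; restricting the grid to a few dimensions keeps the number of occupied cells, and therefore memory, small.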