Data stream clustering has emerged as a challenging and interesting problem over the past few years. Because streams evolve over time and the data stream model permits only a single pass over the data, traditional clustering algorithms are inapplicable to stream clustering. The problem becomes even more challenging when the data are high-dimensional and the clusters are not linearly separable in the input space. In this paper, we propose a nonlinear stream clustering algorithm that adapts to the stream's evolutionary changes. Using kernel methods to handle the nonlinearity of the data separation, we propose a novel 2-tier stream clustering architecture. Tier-1 captures the temporal locality in the stream by partitioning it into segments using a kernel-based novelty detection approach. Tier-2 exploits this segment structure to continuously project the streaming data nonlinearly onto a low-dimensional space (LDS) before assigning each point to a cluster. We demonstrate the effectiveness of our approach through extensive experimental evaluation on various real-world datasets.
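
To make the two tiers concrete, the following is a minimal sketch and not the algorithm evaluated in the paper, whose details the abstract does not give. It assumes a Gaussian (RBF) kernel, uses the squared feature-space distance to the current segment's mean as a simple stand-in for the kernel-based novelty detector in Tier-1, and uses plain kernel PCA on each segment as a stand-in for the nonlinear projection in Tier-2. All function names and the parameters `gamma`, `threshold`, and `min_len` are hypothetical, and the final cluster assignment step is omitted.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def novelty_score(x, segment, gamma=0.5):
    """Squared feature-space distance from x to the mean of the segment.

    Assumption: this simple score stands in for a kernel novelty detector.
    """
    x = x[None, :]
    k_xx = 1.0  # for the RBF kernel, k(x, x) = 1
    k_xs = rbf_kernel(x, segment, gamma).mean()
    k_ss = rbf_kernel(segment, segment, gamma).mean()
    return k_xx - 2.0 * k_xs + k_ss

def segment_stream(stream, threshold=0.6, min_len=5, gamma=0.5):
    """Tier-1 sketch: start a new segment when an arriving point looks novel."""
    segments, current = [], [stream[0]]
    for x in stream[1:]:
        seg = np.asarray(current)
        if len(current) >= min_len and novelty_score(x, seg, gamma) > threshold:
            segments.append(seg)
            current = [x]
        else:
            current.append(x)
    segments.append(np.asarray(current))
    return segments

def kernel_pca_project(segment, n_components=2, gamma=0.5):
    """Tier-2 sketch: project a segment onto its leading kernel principal components."""
    n = segment.shape[0]
    K = rbf_kernel(segment, segment, gamma)
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    Kc = H @ K @ H                            # kernel matrix centered in feature space
    vals, vecs = np.linalg.eigh(Kc)           # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    return vecs * np.sqrt(np.maximum(vals, 0.0))

# Toy stream (made-up data): 20 points near the origin, then 20 points near (10, 10, 10).
rng = np.random.default_rng(0)
stream = np.vstack([rng.normal(0.0, 0.1, (20, 3)), rng.normal(10.0, 0.1, (20, 3))])

segments = segment_stream(stream)
low_dim = [kernel_pca_project(seg) for seg in segments]
print([seg.shape for seg in segments], [z.shape for z in low_dim])
```

In this sketch each segment is projected independently, so the low-dimensional coordinates of different segments are not directly comparable. A streaming method would also need to bound the memory used per segment, which the sketch ignores.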