We study incremental clustering of objects that grow and accumulate over time. The objects come from a multi-table stream, e.g. streams of Customers and Transactions. As the Transactions stream accumulates, the Customers' profiles grow. First, we use incremental propositionalisation to convert the multi-table stream into a single-table stream, to which we apply clustering. For this purpose, we develop an online version of the K-Means algorithm that can handle these swelling objects as well as any new objects that arrive. The algorithm also monitors the quality of the model and performs re-clustering when it deteriorates. We evaluate our method on the PKDD Challenge 1999 dataset.
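
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each object is propositionalised into a fixed-length numeric vector (here a hypothetical transaction count and running total per customer), that a changed object is handled by removing its old vector from its cluster mean and re-inserting the new one, and that model quality is measured as mean squared distance to the assigned centroid. The class name `GrowingObjectKMeans`, its methods, and the `drift_threshold` parameter are all illustrative assumptions.

```python
import math
from collections import defaultdict


class GrowingObjectKMeans:
    """Hypothetical online K-Means for objects whose feature vectors change."""

    def __init__(self, seeds, drift_threshold=1.5):
        self.centroids = [list(c) for c in seeds]  # seeds only guide early assignments
        self.counts = [0] * len(seeds)             # members per cluster
        self.vectors = {}                          # object id -> latest vector
        self.assignment = {}                       # object id -> cluster index
        self.drift_threshold = drift_threshold     # assumed tolerance for quality decay

    def _nearest(self, x):
        return min(range(len(self.centroids)),
                   key=lambda j: math.dist(x, self.centroids[j]))

    def _remove(self, oid):
        # Subtract the object's old vector from the running mean of its cluster.
        j, x, n = self.assignment.pop(oid), self.vectors.pop(oid), None
        n = self.counts[j]
        if n > 1:
            self.centroids[j] = [(c * n - xi) / (n - 1)
                                 for c, xi in zip(self.centroids[j], x)]
        self.counts[j] = n - 1

    def update(self, oid, x):
        """Insert a new object, or replace the vector of one that has grown."""
        if oid in self.assignment:
            self._remove(oid)
        j = self._nearest(x)
        n = self.counts[j]
        self.centroids[j] = [(c * n + xi) / (n + 1)
                             for c, xi in zip(self.centroids[j], x)]
        self.counts[j] = n + 1
        self.vectors[oid], self.assignment[oid] = list(x), j
        return j

    def mean_sq_error(self):
        if not self.vectors:
            return 0.0
        total = sum(math.dist(x, self.centroids[self.assignment[o]]) ** 2
                    for o, x in self.vectors.items())
        return total / len(self.vectors)

    def needs_reclustering(self, baseline):
        # Quality has deteriorated if the error grew past the assumed tolerance.
        return self.mean_sq_error() > self.drift_threshold * baseline

    def recluster(self, iterations=10):
        """Batch Lloyd iterations over stored vectors, from the current centroids."""
        for _ in range(iterations):
            groups = defaultdict(list)
            for oid, x in self.vectors.items():
                groups[self._nearest(x)].append(oid)
            for j in range(len(self.centroids)):
                members = groups.get(j, [])
                self.counts[j] = len(members)
                if members:
                    dim = len(self.centroids[j])
                    self.centroids[j] = [
                        sum(self.vectors[o][d] for o in members) / len(members)
                        for d in range(dim)]
                for o in members:
                    self.assignment[o] = j


# Illustrative propositionalisation: each transaction grows a customer's profile.
profiles = defaultdict(lambda: [0, 0.0])           # customer -> [count, total amount]
model = GrowingObjectKMeans(seeds=[[1, 10.0], [5, 500.0]])
for customer, amount in [("c1", 12.0), ("c2", 480.0), ("c1", 300.0), ("c3", 9.0)]:
    profiles[customer][0] += 1
    profiles[customer][1] += amount
    model.update(customer, profiles[customer])
```

Storing the latest vector per object is what lets a grown object be moved between clusters without a full pass over the data; the cost is memory proportional to the number of live objects. A caller would record `mean_sq_error()` after an initial clustering as the baseline and call `recluster()` whenever `needs_reclustering(baseline)` returns true.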