A framework for clustering categorical time-evolving data
IEEE Transactions on Fuzzy Systems
Although the problem of clustering numerical time-evolving data is well explored, clustering categorical time-evolving data remains a challenging issue. In this paper, we propose a generalized clustering framework that utilizes existing clustering algorithms and adopts a sliding-window technique to detect whether a concept drift occurs in the incoming sliding window. The framework is composed of two algorithms: the Drifting Concept Detecting (DCD) algorithm, which detects changes in cluster distributions between the current sliding window and the last clustering result, and the Cluster Relationship Analysis (CRA) algorithm, which analyzes the relationship between clustering results at different times. In DCD, the concept is said to drift if quite a large number of outliers are found in the current sliding window, or if the ratio of data points changes in quite a large number of clusters. A sliding window in which the concept has drifted is re-clustered to capture the recent concept. In CRA, a visualization method is devised to facilitate the observation of the evolving clustering results. The framework is validated on real and synthetic data sets, and is shown not only to detect drifting concepts accurately but also to attain clustering results of better quality.
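The two drift conditions described for DCD can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how points are assigned to clusters or flagged as outliers, so the sketch assumes each point in the window already carries a cluster label or a hypothetical `OUTLIER` marker. The function names and the three threshold parameters are likewise assumptions chosen for illustration.

```python
from collections import Counter

OUTLIER = None  # hypothetical marker for a point that fits no existing cluster


def cluster_ratios(labels, cluster_ids):
    """Fraction of the window's points that fall in each known cluster."""
    counts = Counter(label for label in labels if label is not OUTLIER)
    total = len(labels)
    return {c: (counts[c] / total if total else 0.0) for c in cluster_ids}


def concept_drifts(prev_ratios, labels,
                   outlier_threshold=0.1,
                   ratio_change_threshold=0.1,
                   varied_cluster_threshold=0.5):
    """Return True if the current window should be re-clustered.

    prev_ratios: cluster id -> ratio of points in the last clustering result.
    labels: one entry per point in the current window (cluster id or OUTLIER).
    All three thresholds are illustrative assumptions, not values from the paper.
    """
    if not prev_ratios:
        return True   # no earlier clustering result to compare against
    if not labels:
        return False  # empty window: nothing to judge

    # Condition 1: too many points fit none of the existing clusters.
    outlier_ratio = sum(1 for label in labels if label is OUTLIER) / len(labels)
    if outlier_ratio > outlier_threshold:
        return True

    # Condition 2: the share of points changed noticeably in too many clusters.
    current = cluster_ratios(labels, prev_ratios.keys())
    varied = sum(
        1 for c in prev_ratios
        if abs(current[c] - prev_ratios[c]) > ratio_change_threshold
    )
    return varied / len(prev_ratios) > varied_cluster_threshold
```

In a sliding-window loop, a window for which `concept_drifts` returns True would be passed to whichever existing clustering algorithm the framework wraps, and `prev_ratios` would be recomputed from the new result. Otherwise the previous clusters are kept.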