A fundamental assumption often made in unsupervised learning is that the problem is static, i.e., the description of the classes does not change with time. However, many practical clustering tasks involve changing environments, so methods for analyzing evolving trends in such environments are of increasing interest and importance. Although the problem of clustering numerical time-evolving data is well explored, clustering categorical time-evolving data remains a challenging issue. In this paper, we propose a generalized clustering framework for categorical time-evolving data, which is composed of three algorithms: a drifting-concept detection algorithm that detects the difference between the current sliding window and the last sliding window; a data-labeling algorithm that decides the most appropriate cluster label for each object of the current sliding window based on the clustering results of the last sliding window; and a cluster-relationship-analysis algorithm that analyzes the relationship between clustering results at different time stamps. The time-complexity analysis indicates that the proposed algorithms are effective for large datasets. Experiments on a real dataset show that the proposed framework not only accurately detects the drifting concepts but also attains clustering results of better quality. Furthermore, compared with the other framework, the proposed one needs fewer parameters, which is favorable for specific applications.
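To make the sliding-window idea concrete, the following is a minimal sketch of how data labeling and drift detection between two consecutive windows might look. It is not the algorithm proposed in the paper: the frequency-based similarity, the function names, and the thresholds `min_sim` and `max_outlier_ratio` are all assumptions chosen for illustration, and the cluster-relationship analysis step is omitted.

```python
from collections import Counter


def cluster_profile(cluster):
    """Per-attribute relative value frequencies for a list of categorical tuples."""
    n = len(cluster)
    n_attrs = len(cluster[0])
    return [
        {v: c / n for v, c in Counter(obj[j] for obj in cluster).items()}
        for j in range(n_attrs)
    ]


def similarity(obj, profile):
    """Mean relative frequency of the object's attribute values in one cluster.

    Assumed measure for illustration only; the paper's measure is not given here.
    """
    return sum(profile[j].get(v, 0.0) for j, v in enumerate(obj)) / len(obj)


def label_window(window, last_clusters, min_sim=0.5):
    """Label each object of the current window with the index of the most
    similar cluster from the last window, or None if no cluster fits well."""
    profiles = [cluster_profile(c) for c in last_clusters]
    labels = []
    for obj in window:
        sims = [similarity(obj, p) for p in profiles]
        best = max(range(len(sims)), key=sims.__getitem__)
        labels.append(best if sims[best] >= min_sim else None)
    return labels


def drift_detected(labels, max_outlier_ratio=0.3):
    """Flag a drifting concept when too many objects fit no previous cluster."""
    outliers = sum(1 for label in labels if label is None)
    return outliers / len(labels) > max_outlier_ratio


# Clustering result of the last window (hypothetical toy data).
last_clusters = [
    [("a", "x"), ("a", "y")],
    [("b", "z"), ("b", "z")],
]

stable_window = [("a", "x"), ("b", "z")]
drifted_window = [("c", "w"), ("c", "w"), ("a", "x")]

stable_labels = label_window(stable_window, last_clusters)
drifted_labels = label_window(drifted_window, last_clusters)
```

In this sketch a window whose objects mostly match the previous clusters is simply labeled, whereas a window with many unmatched objects is flagged as drifted and would be reclustered from scratch in a full pipeline.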