Scalable Hierarchical Clustering Method for Sequences of Categorical Values

Authors:
Tadeusz Morzy;Marek Wojciechowski;Maciej Zakrzewicz
Affiliations:
-;-;-
Venue:
PAKDD '01 Proceedings of the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining
Year:
2001

Citing 15
Cited 3

Algorithms for clustering data

Algorithms for clustering data
BIRCH: an efficient data clustering method for very large databases

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
CURE: an efficient clustering algorithm for large databases

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Automatic subspace clustering of high dimensional data for data mining applications

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Syntactic clustering of the Web

Selected papers from the sixth international conference on World Wide Web
CACTUS—clustering categorical data using summaries

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining features for sequence classification

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Towards adaptive Web sites: conceptual framework and case study

WWW '99 Proceedings of the eighth international conference on World Wide Web
Clustering transactions using large items

Proceedings of the eighth international conference on Information and knowledge management
Clustering Algorithms

Clustering Algorithms
Knowledge Acquisition Via Incremental Conceptual Clustering

Machine Learning
Mining Sequential Patterns: Generalizations and Performance Improvements

EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
Mining Sequential Patterns

ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Clustering Categorical Data: An Approach Based on Dynamical Systems

VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
ROCK: A Robust Clustering Algorithm for Categorical Attributes

ICDE '99 Proceedings of the 15th International Conference on Data Engineering

Categorizing Visitors Dynamically by Fast and Robust Clustering of Access Logs

WI '01 Proceedings of the First Asia-Pacific Conference on Web Intelligence: Research and Development
Hybrid O(n √ n) clustering for sequential web usage mining

AI'06 Proceedings of the 19th Australian joint conference on Artificial Intelligence: advances in Artificial Intelligence
Using Patterns Co-occurrence Matrix for Cleaning Closed Sequential Patterns for Text Mining

WI-IAT '12 Proceedings of the The 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 01

Quantified Score

Hi-index	0.00

Visualization

Abstract

Data clustering methods have many applications in the area of data mining. Traditional clustering algorithms deal with quantitative or categorical data points. However, there exist many important databases that store categorical data sequences, where significant knowledge is hidden behind sequential dependencies between the data. In this paper we introduce a problem of clustering categorical data sequences and present an efficient scalable algorithm to solve the problem. Our algorithm implements the general idea of agglomerative hierarchical clustering and uses frequently occurring subsequences as features describing data sequences. The algorithm not only discovers a set of high quality clusters containing similar data sequences but also provides descriptions of the discovered clusters.