Pattern-Oriented Hierarchical Clustering

Authors:
Tadeusz Morzy; Marek Wojciechowski; Maciej Zakrzewicz
Venue:
ADBIS '99 Proceedings of the Third East European Conference on Advances in Databases and Information Systems
Year:
1999

Citing 6

BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data

CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data

Mining Sequential Patterns: Generalizations and Performance Improvements
EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology

Mining Sequential Patterns
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering

Clustering Categorical Data: An Approach Based on Dynamical Systems
VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases

Efficient and Effective Clustering Methods for Spatial Data Mining
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Cited 4

Multi-objective phylogenetic algorithm: solving multi-objective decomposable deceptive problems
EMO'11 Proceedings of the 6th international conference on Evolutionary multi-criterion optimization

PrefixUnion: mining traversal patterns efficiently in virtual environments
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part III

An improvement algorithm for accessing patterns through clustering in interactive VRML environments
PCM'04 Proceedings of the 5th Pacific Rim conference on Advances in Multimedia Information Processing - Volume Part III

S2MP: similarity measure for sequential patterns
AusDM '08 Proceedings of the 7th Australasian Data Mining Conference - Volume 87

Abstract

Clustering is a data mining method that consists of discovering interesting data distributions in very large databases. Its applications include customer segmentation, catalog design, store layout, and stock market segmentation. In this paper, we consider the problem of discovering similarity-based clusters in a large database of event sequences. We introduce a hierarchical algorithm that uses sequential patterns found in the database to efficiently generate both the clustering model and the data clusters. The algorithm iteratively merges smaller, similar clusters into bigger ones until the requested number of clusters is reached. Because no well-defined metric space is available, we propose a similarity measure that is used in cluster merging. The advantage of the proposed measure is that no additional access to the source database is needed to evaluate inter-cluster similarities.
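
The merging loop described in the abstract can be illustrated with a minimal sketch. The abstract does not define the similarity measure, so this sketch assumes a Jaccard coefficient over the sets of sequence ids covered by each cluster's patterns; the pattern labels, the `pattern_support` mapping, and the function names are hypothetical and not taken from the paper. The point it shows is that once each pattern's supporting sequences are known, similarities can be evaluated from those sets alone, without rereading the source sequences.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_clusters(pattern_support, k):
    """Agglomeratively merge pattern-based clusters until at most k remain.

    pattern_support maps each pattern (hypothetical label) to the set of
    sequence ids that contain it. Each cluster is a pair
    (patterns, sequence_ids). Similarity is an assumed Jaccard measure
    computed from these sets only.
    """
    # Start with one cluster per pattern.
    clusters = [({p}, set(ids)) for p, ids in pattern_support.items()]
    while len(clusters) > k:
        # Find the most similar pair of clusters.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: jaccard(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        if jaccard(clusters[i][1], clusters[j][1]) == 0.0:
            break  # no overlapping clusters left to merge
        merged = (clusters[i][0] | clusters[j][0],
                  clusters[i][1] | clusters[j][1])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Toy input: four patterns and the ids of the sequences that contain them.
example = {
    "a->b": {1, 2, 3},
    "b->c": {2, 3, 4},
    "x->y": {5, 6},
    "y->z": {6, 7},
}
result = merge_clusters(example, 2)
```

With this toy input, the two patterns that share sequences 2 and 3 are merged first, then the two that share sequence 6, which leaves the two requested clusters. The pairwise search is quadratic per merge step and is kept simple for readability; it is not meant to reflect the efficiency claims of the paper.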