Clustering is a data mining problem that has received significant attention from the database community. Data set size, dimensionality and sparsity have been identified as aspects that make clustering more difficult. This work introduces a fast algorithm to cluster large binary data sets in which data points have high dimensionality and most of their coordinates are zero. This is the case with basket data, where each transaction contains a set of items and can be represented as a sparse binary vector of very high dimensionality. An experimental section shows the performance, advantages and limitations of the proposed approach.
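To illustrate the sparse binary representation mentioned above, here is a minimal sketch in Python. It is not the algorithm proposed in this work; the item names, the helper functions `to_sparse` and `sq_distance`, and the centroid values are all hypothetical. It only shows that a transaction can be stored as the indices of its nonzero coordinates, and that the squared Euclidean distance from such a point to a dense centroid can be computed by touching only those coordinates, using the identity ||x - c||^2 = ||c||^2 + sum over nonzero i of (1 - 2 c_i) for binary x.

```python
def to_sparse(transaction, item_index):
    """Return the sorted indices of the nonzero coordinates of a basket."""
    return sorted({item_index[item] for item in transaction})


def sq_distance(sparse_point, centroid, centroid_sq_norm):
    """Squared Euclidean distance between a binary point and a dense centroid.

    For binary x: ||x - c||^2 = ||c||^2 + sum_{i : x_i = 1} (1 - 2 * c_i),
    so only the nonzero coordinates of x are visited.
    """
    return centroid_sq_norm + sum(1.0 - 2.0 * centroid[i] for i in sparse_point)


# Hypothetical item catalogue; real basket data would have thousands of items.
items = ["bread", "milk", "eggs", "beer", "diapers"]
item_index = {name: i for i, name in enumerate(items)}

# One transaction, stored as the indices of the items it contains.
x = to_sparse(["milk", "beer"], item_index)

# Hypothetical centroid; its squared norm is computed once and reused.
centroid = [0.5, 1.0, 0.0, 0.0, 0.25]
centroid_sq_norm = sum(c * c for c in centroid)

d = sq_distance(x, centroid, centroid_sq_norm)

# Same distance computed the dense way, for comparison.
dense_x = [1.0 if i in x else 0.0 for i in range(len(items))]
d_dense = sum((a - b) ** 2 for a, b in zip(dense_x, centroid))
```

With this layout, the cost of one distance computation depends on the number of items in the transaction rather than on the total number of dimensions, which is why sparsity matters for basket data.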