This paper studies the problem of categorical data clustering, especially for transactional data characterized by high dimensionality and large volume. Starting from a heuristic method of increasing the height-to-width ratio of the cluster histogram, we develop a novel algorithm, CLOPE, that is very fast and scalable while remaining quite effective. We demonstrate the performance of our algorithm on two real-world datasets and compare CLOPE with state-of-the-art algorithms.
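To make the histogram heuristic concrete, the following is a minimal sketch and not the paper's implementation. It assumes, since the abstract does not define them, that a cluster's histogram plots each distinct item against its number of occurrences, that the width is the number of distinct items, and that the height is the total number of occurrences divided by the width. The function names and the toy transactions are hypothetical.

```python
from collections import Counter


def histogram_stats(cluster):
    """Return (size, width, height) of a cluster's item histogram.

    Assumed definitions (not stated in the abstract):
      size   = total item occurrences across all transactions
      width  = number of distinct items
      height = size / width
    """
    # Count each item once per transaction it appears in.
    counts = Counter(item for txn in cluster for item in set(txn))
    size = sum(counts.values())
    width = len(counts)
    height = size / width if width else 0.0
    return size, width, height


def height_to_width_ratio(cluster):
    """Ratio the heuristic tries to increase: taller, narrower is better."""
    _, width, height = histogram_stats(cluster)
    return height / width if width else 0.0


# Transactions that share many items give a tall, narrow histogram.
tight = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}]
# Transactions with no overlap give a flat, wide histogram.
loose = [{"a", "b"}, {"c", "d"}, {"e", "f"}]

print(height_to_width_ratio(tight))
print(height_to_width_ratio(loose))
```

Under these assumed definitions, the overlapping transactions score higher than the disjoint ones, which is the intuition behind preferring clusters whose histograms have a larger height-to-width ratio.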