Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. The problem has been widely studied in machine learning, databases, and statistics. This paper studies the clustering of high-dimensional data and proposes CoFD, a non-distance-based clustering algorithm for high-dimensional spaces. Based on the maximum likelihood principle, CoFD attempts to optimize its parameter settings so as to maximize the likelihood of the data points under the model generated by those parameters. Distributed versions of the algorithm, called D-CoFD, are also proposed. Experimental results on both synthetic and real data sets show the efficiency and effectiveness of the CoFD and D-CoFD algorithms.