Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This paper studies the problem of clustering binary data, which arise, for example, in market basket datasets, where each transaction contains a set of items, and in document datasets, where each document is represented as a "bag of words". The contribution of the paper is threefold. First, we present a general model for clustering binary data. The model treats data and features equally, based on their symmetric association relations, and explicitly describes the data assignments as well as the feature assignments. We characterize several variations of the general model with different optimization procedures. Second, we establish the connections between our clustering model and other existing clustering methods. Third, we discuss the problem of determining the number of clusters for binary clustering. Experimental results show the effectiveness of the proposed clustering model.
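To make the idea of describing data assignments and feature assignments together more concrete, the following is a minimal sketch of one possible alternating procedure on a binary matrix. It is an illustration under stated assumptions, not the optimization procedure of the paper: the function name `cluster_binary`, the majority-vote rule for assigning features to clusters, and the use of Hamming distance for assigning rows are all choices made here for the example.

```python
def cluster_binary(W, labels, k, n_iter=20):
    """Alternate between feature assignments and data assignments.

    W      : list of rows, each a list of 0/1 values (data points x features)
    labels : initial cluster index for each row (hypothetical starting point)
    k      : number of clusters
    Returns (labels, B), where B[c][j] == 1 means feature j is assigned
    to cluster c.
    """
    n, m = len(W), len(W[0])
    labels = list(labels)
    B = [[0] * m for _ in range(k)]
    for _ in range(n_iter):
        # Feature step (assumed rule): cluster c takes feature j when
        # more than half of the rows currently in c contain that feature.
        B = []
        for c in range(k):
            rows = [W[i] for i in range(n) if labels[i] == c]
            B.append([
                1 if rows and 2 * sum(r[j] for r in rows) > len(rows) else 0
                for j in range(m)
            ])
        # Data step (assumed rule): each row moves to the cluster whose
        # feature pattern is closest in Hamming distance.
        new_labels = []
        for row in W:
            dists = [sum(a != b for a, b in zip(row, B[c])) for c in range(k)]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels, B


# Toy market-basket style data: rows are transactions, columns are items.
W = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
]
labels, B = cluster_binary(W, [0, 1, 0, 1, 0, 1], k=2)
print(labels)
print(B)
```

In this sketch the rows and the columns each receive an explicit cluster description (`labels` for the data, `B` for the features), which is the symmetric treatment the abstract refers to. The number of clusters `k` is supplied by hand here; choosing it is the separate problem mentioned as the third contribution.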