Distributional clustering of words for text classification
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Normalized Cuts and Image Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence
Proceedings of the ninth international conference on Information and knowledge management
Co-clustering documents and words using bipartite spectral graph partitioning
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Bipartite graph partitioning and data clustering
Proceedings of the tenth international conference on Information and knowledge management
Document clustering with cluster refinement and model selection capabilities
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Concept Decompositions for Large Sparse Text Data Using Clustering
Machine Learning
A Min-max Cut Algorithm for Graph Partitioning and Data Clustering
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Document clustering based on non-negative matrix factorization
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Multilevel spectral hypergraph partitioning with arbitrary vertex sizes
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
An introduction to kernel-based learning algorithms
IEEE Transactions on Neural Networks
A general model for clustering binary data
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Document Clustering Using Locality Preserving Indexing
IEEE Transactions on Knowledge and Data Engineering
A Unified View on Clustering Binary Data
Machine Learning
A partitioning based algorithm to fuzzy co-cluster documents and words
Pattern Recognition Letters
A comprehensive comparison study of document clustering for a biomedical digital library MEDLINE
Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Information Processing and Management: an International Journal
Structural and temporal analysis of the blogosphere through community factorization
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Possibilistic fuzzy co-clustering of large document collections
Pattern Recognition
Biomedical ontology improves biomedical literature clustering performance: a comparison study
International Journal of Bioinformatics Research and Applications
Utilizing phrase-similarity measures for detecting and clustering informative RSS news articles
Integrated Computer-Aided Engineering
Document Clustering Based on Spectral Clustering and Non-negative Matrix Factorization
IEA/AIE '08 Proceedings of the 21st international conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems: New Frontiers in Applied Artificial Intelligence
Generating Fuzzy Equivalence Classes on RSS News Articles for Retrieving Correlated Information
ICCSA '08 Proceedings of the international conference on Computational Science and Its Applications, Part II
Learning Bidirectional Similarity for Collaborative Filtering
ECML PKDD '08 Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
Clustering based on matrix approximation: a unifying view
Knowledge and Information Systems
Expert Systems with Applications: An International Journal
Detect and track latent factors with online nonnegative matrix factorization
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Document Clustering with Cluster Refinement and Non-negative Matrix Factorization
ICONIP '09 Proceedings of the 16th International Conference on Neural Information Processing: Part II
Mining fuzzy frequent itemsets for hierarchical document clustering
Information Processing and Management: an International Journal
Document clustering using NMF and fuzzy relation
Proceedings of the 5th International Conference on Ubiquitous Information Management and Communication
Learning bidirectional asymmetric similarity for collaborative filtering via matrix factorization
Data Mining and Knowledge Discovery
Integrating Document Clustering and Multidocument Summarization
ACM Transactions on Knowledge Discovery from Data (TKDD)
Discriminative concept factorization for data representation
Neurocomputing
Representing document as dependency graph for document clustering
Proceedings of the 20th ACM international conference on Information and knowledge management
Improving quality of search results clustering with approximate matrix factorisations
ECIR'06 Proceedings of the 28th European conference on Advances in Information Retrieval
Locality-constrained concept factorization
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Two
Clustering and understanding documents via discrimination information maximization
PAKDD'12 Proceedings of the 16th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I
Journal of Intelligent Information Systems
Feature selection for unsupervised learning
ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part III
Discriminative Orthogonal Nonnegative matrix factorization with flexibility for data representation
Expert Systems with Applications: An International Journal
In this paper, we propose a new data clustering method called concept factorization, which models each concept as a linear combination of the data points and each data point as a linear combination of the concepts. Under this model, data clustering amounts to computing the two sets of linear coefficients, which is done by finding the non-negative solution that minimizes the reconstruction error of the data points. The cluster label of each data point can then be derived directly from the obtained coefficients. The method differs from clustering based on non-negative matrix factorization (NMF) \cite{Xu03} in that it can be applied to data containing negative values and can be implemented in the kernel space. Our experimental results show that the proposed method and its variations perform best among the 11 algorithms and their variations that we evaluated on both the TDT2 and Reuters-21578 corpora. In addition to its good performance, the new method has the merit of an easy and reliable derivation of the clustering results.
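The model described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's algorithm: the abstract does not give the update rules, so the sketch assumes NMF-style multiplicative updates on the Gram matrix K = XᵀX and further assumes that K has no negative entries (the simplest case). The function names `concept_factorization` and `reconstruction_error`, the parameters `n_iter` and `eps`, and the toy data are all hypothetical.

```python
import numpy as np

def concept_factorization(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate X (features x points) as X @ W @ V.T with W, V >= 0.

    Column j of X @ W is concept j (a combination of data points);
    row i of V holds the coefficients of point i over the concepts.
    Assumption: the Gram matrix K = X.T @ X is entrywise non-negative.
    """
    rng = np.random.default_rng(seed)
    n_points = X.shape[1]
    K = X.T @ X                       # only inner products of points are needed
    W = rng.random((n_points, k))     # non-negative random initialisation
    V = rng.random((n_points, k))
    for _ in range(n_iter):
        # Assumed multiplicative updates; eps guards against division by zero.
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        KW = K @ W
        V *= KW / (V @ (W.T @ KW) + eps)
    return W, V

def reconstruction_error(X, W, V):
    """Squared Frobenius norm of X - X W V^T."""
    return np.linalg.norm(X - X @ W @ V.T) ** 2

if __name__ == "__main__":
    # Toy data: six points (columns) in two groups with disjoint features.
    X = np.array([
        [1.0, 0.9, 1.1, 0.0, 0.0, 0.0],
        [0.5, 0.6, 0.4, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0, 1.2, 0.8],
        [0.0, 0.0, 0.0, 0.7, 0.6, 0.9],
    ])
    W, V = concept_factorization(X, k=2)
    labels = V.argmax(axis=1)         # label = concept with the largest coefficient
    print("labels:", labels)
    print("error:", reconstruction_error(X, W, V))
```

Because the updates touch the data only through K, a kernel matrix could in principle be substituted for XᵀX, which is one way to read the abstract's remark about the kernel space. Data with negative values can make K contain negative entries, in which case these simple updates no longer keep W and V non-negative and a modified scheme would be needed.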