Clustering Web 2.0 items (i.e., web resources such as videos and images) into semantic groups benefits many applications, such as organizing items, generating meaningful tags, and improving web search. In this paper, we systematically investigate how user-generated comments can be used to improve the clustering of Web 2.0 items. In our preliminary study of Last.fm, we find that the two data sources extracted from user comments (the textual comments and the commenting users) provide evidence complementary to the items' intrinsic features. These three sources vary in quality but, importantly, we find that incorporating all three improves clustering. To accommodate this quality imbalance, we invoke multi-view clustering, in which each data source represents a view, aiming to best leverage the utility of the different views. To combine multiple views under a principled framework, we propose CoNMF (Co-regularized Non-negative Matrix Factorization), which extends NMF to multi-view clustering by jointly factorizing the multiple matrices through co-regularization. Under our CoNMF framework, we devise two paradigms, pair-wise CoNMF and cluster-wise CoNMF, and propose iterative algorithms for their joint factorization. Experimental results on Last.fm and Yelp datasets demonstrate the effectiveness of our solution. On Last.fm, CoNMF outperforms k-means with a statistically significant F1 increase of 14%, while achieving performance comparable to the state-of-the-art multi-view clustering method CoSC (Co-regularized Spectral Clustering). On the Yelp dataset, CoNMF outperforms the best baseline, CoSC, with a statistically significant performance gain of 7%.
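To make the idea of joint factorization through co-regularization more concrete, the following is a minimal NumPy sketch of what a pair-wise co-regularized NMF could look like. It is not the paper's algorithm: the abstract does not state the objective or the update rules, so everything here is an assumption for illustration. In particular, the objective (squared reconstruction error per view plus a penalty `lam` on the squared difference between the item-factor matrices of every pair of views), the multiplicative updates, the function names `objective`, `pairwise_conmf`, and `consensus_labels`, and the choice to derive clusters from the averaged item factors are all hypothetical. The sketch also omits any normalization of the factors, which a practical method would likely need so that factors from different views are comparable.

```python
import numpy as np


def objective(views, Ws, Hs, lam):
    """Assumed loss: per-view reconstruction error plus pair-wise disagreement."""
    recon = sum(np.linalg.norm(X - W @ H) ** 2 for X, W, H in zip(views, Ws, Hs))
    coreg = sum(
        np.linalg.norm(Ws[v] - Ws[u]) ** 2
        for v in range(len(Ws))
        for u in range(v + 1, len(Ws))
    )
    return recon + lam * coreg


def pairwise_conmf(views, k, lam=1.0, n_iter=200, eps=1e-9, seed=0):
    """Illustrative pair-wise co-regularized NMF (hypothetical formulation).

    views: list of non-negative arrays, each of shape (n_items, n_features_v).
    Returns the item factors W_v, the feature factors H_v, and the loss history.
    """
    rng = np.random.default_rng(seed)
    n_items = views[0].shape[0]
    n_views = len(views)
    Ws = [rng.random((n_items, k)) + eps for _ in views]
    Hs = [rng.random((k, X.shape[1])) + eps for X in views]
    history = [objective(views, Ws, Hs, lam)]

    for _ in range(n_iter):
        for v, X in enumerate(views):
            W, H = Ws[v], Hs[v]
            # Sum of the item factors of all other views (0 if there is one view).
            others = sum(Ws[u] for u in range(n_views) if u != v)
            # Multiplicative update for W_v: the co-regularizer pulls W_v
            # towards the other views' item factors.
            numer = X @ H.T + lam * others
            denom = W @ (H @ H.T) + lam * (n_views - 1) * W + eps
            W = W * numer / denom
            # Standard multiplicative NMF update for H_v given the new W_v.
            H = H * (W.T @ X) / (W.T @ W @ H + eps)
            Ws[v], Hs[v] = W, H
        history.append(objective(views, Ws, Hs, lam))
    return Ws, Hs, history


def consensus_labels(Ws):
    """Assign each item to the factor with the largest average weight across views."""
    return np.argmax(np.mean(Ws, axis=0), axis=1)
```

In the setting the abstract describes, `views` would hold three item-by-feature matrices (intrinsic item features, comment text, and commenting users), with rows aligned to the same items. The weight `lam` controls how strongly the views are pushed to agree; setting it to 0 reduces the sketch to independent NMF runs per view.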