Discovering structure within a collection of high-dimensional input vectors is a recurring problem in machine learning. A widely used algorithm for such tasks is Non-negative Matrix Factorization (NMF): the high-dimensional vectors are arranged as columns of a data matrix, which is decomposed into two non-negative matrix factors of much lower rank. Here, we adopt the NMF learning scheme proposed by Van hamme (2008) [1], in which the training data are combined with supervisory data that impose the low-dimensional structure known to be present. Reconstructing this supervisory data for previously unseen inputs then makes their underlying structure explicit. It has been noted that for many problems, not all features of the training data correlate equally well with the underlying structure; in other words, some features are relevant for detecting patterns in the data, while others are not. In this paper, we propose an algorithm that builds on the learning scheme of Van hamme (2008) [1] and automatically weights each input feature according to its relevance. Applications include both data improvement and feature selection. We experimentally show that our algorithm outperforms similar techniques on both counts.
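To make the general scheme concrete, the following is a minimal NumPy sketch of supervised NMF in the spirit described above: the supervisory (label) rows are stacked on the data matrix during training, and reconstructed for unseen inputs at test time. This is an illustration under stated assumptions, not the authors' implementation; all function names, shapes, and hyperparameters are hypothetical, and the paper's per-feature relevance weighting is not sketched here.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Basic multiplicative-update NMF minimizing the Frobenius norm ||X - W H||."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + eps
    H = rng.random((rank, X.shape[1])) + eps
    for _ in range(n_iter):
        # Standard Lee-Seung updates; eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def train_supervised_nmf(V, S, rank):
    """Factorize the stacked matrix [V; S] so data rows and label rows share H.

    V: data matrix (n_features x n_samples), non-negative.
    S: supervisory matrix (n_classes x n_samples), e.g. one-hot label columns.
    """
    X = np.vstack([V, S])
    W, H = nmf(X, rank)
    W_v, W_s = W[: V.shape[0]], W[V.shape[0]:]
    return W_v, W_s

def predict(W_v, W_s, V_new, n_iter=200, eps=1e-9, seed=0):
    """Infer H for unseen data with the data dictionary W_v held fixed,
    then reconstruct the supervisory rows for those inputs."""
    rng = np.random.default_rng(seed)
    H = rng.random((W_v.shape[1], V_new.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W_v.T @ V_new) / (W_v.T @ W_v @ H + eps)
    return W_s @ H  # soft reconstruction of the label rows for V_new
```

For example, with non-negative features V and one-hot label rows S, `predict` returns soft label activations for each new column, which can be thresholded or arg-maxed to read off the recovered structure.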