Dimensionality reduction can efficiently improve the computational performance of classifiers in text categorization, and non-negative matrix factorization (NMF) can easily map the high-dimensional term space into a low-dimensional semantic subspace. Moreover, the non-negativity of the basis vectors gives the semantic subspace a meaningful interpretation. As a linear reconstruction method, however, NMF is sensitive to noise, missing data, and outliers, and therefore usually fails to achieve satisfactory classification performance. This paper proposes a novel approach in which the training text and its category information are fused, and a transformation matrix mapping the term space into a semantic subspace is obtained by a basis-orthogonal non-negative matrix factorization followed by truncation. With these transformations, the dimensionality can be reduced aggressively. Experimental results show that the proposed approach maintains good classification performance even in very low-dimensional settings.
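The general pipeline the abstract describes can be illustrated with a minimal sketch, assuming scikit-learn and a tiny hypothetical corpus; plain NMF stands in for the paper's basis-orthogonal factorization with truncation, and the fusion of category information is omitted.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF
from sklearn.linear_model import LogisticRegression

# Hypothetical toy corpus for illustration only.
docs = [
    "stock market trading shares profit",
    "bank interest rates finance loan",
    "football match goal team player",
    "tennis tournament player serve match",
]
labels = [0, 0, 1, 1]  # 0 = finance, 1 = sports

# Map documents into the high-dimensional term space (TF-IDF weighting).
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)            # shape: (n_docs, n_terms)

# Factorize X ~ W @ H; the non-negative rows of H span a low-dimensional
# semantic subspace, and W holds the reduced document representations.
nmf = NMF(n_components=2, init="nndsvda", random_state=0, max_iter=500)
W = nmf.fit_transform(X)                 # shape: (n_docs, 2)

# Train a classifier on the aggressively reduced representation.
clf = LogisticRegression().fit(W, labels)
preds = clf.predict(W)
```

Here the term space (one dimension per vocabulary word) is compressed to just two semantic components before classification, mirroring the "very low dimensional" regime the experiments target.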