Dimensionality reduction with category information fusion and non-negative matrix factorization for text categorization

Authors:
Wenbin Zheng; Yuntao Qian; Hong Tang
Affiliations:
College of Computer Science and Technology, Zhejiang University, Hangzhou, China and College of Information Engineering, China Jiliang University, Hangzhou, China; College of Computer Science and Technology, Zhejiang University, Hangzhou, China; School of Aeronautics and Astronautics, Zhejiang University, Hangzhou, China and College of Metrological Technology & Engineering, China Jiliang University, Hangzhou, China
Venue:
AICI'11 Proceedings of the Third international conference on Artificial intelligence and computational intelligence - Volume Part III
Year:
2011

Abstract

Dimensionality reduction can improve the computational efficiency of classifiers in text categorization, and non-negative matrix factorization (NMF) can map the high-dimensional term space into a low-dimensional semantic subspace. Moreover, the non-negativity of the basis vectors gives the semantic subspace a meaningful interpretation. However, as a linear reconstruction method, NMF is sensitive to noise, missing data, and outliers, so it usually does not achieve satisfactory classification performance. This paper proposes a novel approach in which the training texts and their category information are fused, and a transformation matrix that maps the term space into a semantic subspace is obtained by basis-orthogonality non-negative matrix factorization followed by truncation. With this transformation matrix, the dimensionality can be reduced aggressively. Experimental results show that the proposed approach maintains good classification performance even at very low dimensionality.
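
The abstract does not give the paper's update rules or fusion scheme, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that fusion means stacking one-hot category rows under the term-document matrix, it substitutes plain Lee-Seung multiplicative-update NMF for the basis-orthogonality variant, and it assumes that truncation means dropping the category rows of the learned basis. The function names, the `weight` parameter, and the projection by the transposed basis are all hypothetical choices made for illustration.

```python
import numpy as np

def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Plain multiplicative-update NMF (V ~ W @ H), without any
    orthogonality constraint on the basis W."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(iters):
        # Lee-Seung updates for the Frobenius-norm objective
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def fit_transform_matrix(X, labels, n_classes, rank, weight=1.0):
    """X: non-negative terms x docs matrix; labels: integer class per doc.
    Returns a terms x rank basis learned from the fused matrix."""
    n_terms, n_docs = X.shape
    # Assumed fusion: append one-hot category rows (scaled by `weight`)
    Y = np.zeros((n_classes, n_docs))
    Y[labels, np.arange(n_docs)] = weight
    V = np.vstack([X, Y])
    W, _ = nmf(V, rank)
    # Assumed truncation: keep only the rows that correspond to terms
    return W[:n_terms]

def project(W_terms, X_new):
    """Map term-space documents into the low-dimensional subspace.
    Using the transposed basis keeps the result non-negative."""
    return W_terms.T @ X_new

# Toy usage with random data (illustrative only)
rng = np.random.default_rng(1)
X_train = rng.random((20, 12))          # 20 terms, 12 training documents
y_train = np.array([0, 1, 2] * 4)       # 3 categories
W_terms = fit_transform_matrix(X_train, y_train, n_classes=3, rank=4)
Z = project(W_terms, X_train)           # 4 x 12 reduced representation
```

Because the category rows are used only during factorization, unlabeled test documents can be projected with `project` in the same way as training documents; a classifier would then be trained on the reduced representation.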