In this paper, a text document categorization method called Theme Word Subspace (TWS) learning is presented, in which theme words jointly express class-semantic information for document classification. For each class corpus, the theme words with high probability in the topic structure are extracted first; these words, as key theme elements, then span a class subspace that jointly represents the semantics and distribution of the class. To categorize a text document, it is assigned to the nearest subspace, that is, the one whose theme words best represent the test document. In TWS, the L1 and L2 norms are used separately to measure the distance from a test document to each subspace. In experiments on a large Chinese text corpus, the proposed TWS learning methods exhibit comparable performance for text document categorization.
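The nearest-subspace decision rule described above can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how theme words are turned into spanning vectors, so the sketch assumes each theme word contributes one vector in term space. The function names (`orthonormal_basis`, `residual`, `distance`, `classify`) and the toy data are hypothetical. As a further simplification, the L1 distance here is the L1 norm of the residual left by an orthogonal (L2) projection, not the true L1-nearest point in the subspace.

```python
import math

def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors that are already in the span
            basis.append([wi / norm for wi in w])
    return basis

def residual(doc, basis):
    """Part of `doc` that the subspace spanned by `basis` cannot represent."""
    r = list(doc)
    for b in basis:
        coeff = sum(di * bi for di, bi in zip(doc, b))
        r = [ri - coeff * bi for ri, bi in zip(r, b)]
    return r

def distance(doc, basis, norm="l2"):
    """L1 or L2 norm of the orthogonal-projection residual (assumed measure)."""
    r = residual(doc, basis)
    if norm == "l1":
        return sum(abs(x) for x in r)
    if norm == "l2":
        return math.sqrt(sum(x * x for x in r))
    raise ValueError("norm must be 'l1' or 'l2'")

def classify(doc, class_bases, norm="l2"):
    """Assign `doc` to the class whose subspace is nearest."""
    return min(class_bases, key=lambda c: distance(doc, class_bases[c], norm))

# Toy data (hypothetical): a 4-term vocabulary, two theme-word vectors per class.
theme_vectors = {
    "sports": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    "finance": [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
}
class_bases = {c: orthonormal_basis(vs) for c, vs in theme_vectors.items()}

doc = [3.0, 1.0, 0.5, 0.0]  # term weights of a test document
print(classify(doc, class_bases, norm="l2"))
print(classify(doc, class_bases, norm="l1"))
```

In this toy case the document's weight falls mostly on the first two terms, so its residual against the "sports" subspace is small under either norm. With real data, the term weighting and the construction of the spanning vectors would follow the paper's topic-model step, which this sketch does not reproduce.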