Classifying Chinese texts in two steps

Authors:
Xinghua Fan;Maosong Sun;Key-Sun Choi;Qin Zhang
Affiliations:
State Key Laboratory of Intelligent Technology and Systems, Tsinghua University, Beijing, China;State Key Laboratory of Intelligent Technology and Systems, Tsinghua University, Beijing, China;Computer Science Division, KORTERM, KAIST, Daejeon, Korea;State Intellectual Property Office of P.R. China, Beijing, China
Venue:
IJCNLP'05: Proceedings of the Second International Joint Conference on Natural Language Processing
Year:
2005

Abstract

This paper proposes a two-step method for Chinese text categorization (TC). In the first step, a Naïve Bayesian classifier is used to determine the fuzzy area between two categories. In the second step, a classifier with more subtle and powerful features handles the documents that fall in the fuzzy area, whose first-step classification is considered unreliable. A preliminary experiment validated the soundness of this method. The method is then extended from two-class TC to multi-class TC. Within this two-step framework, we further try to improve the classifier by taking the dependencies among features into consideration in the second step, resulting in a Causality Naïve Bayesian Classifier.
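
The two-step idea can be illustrated with a minimal sketch. It is not the authors' implementation: the log-odds margin used to define the fuzzy area, the character unigram and bigram features, the use of Naive Bayes in both steps, the toy training strings (Latin letters standing in for Chinese characters), and every name in the code are assumptions made for illustration only.

```python
import math
from collections import Counter


class NaiveBayes:
    """Two-class multinomial Naive Bayes with add-one smoothing."""

    def __init__(self, featurize):
        self.featurize = featurize  # maps a text to a list of features

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = Counter(labels)  # number of training documents per class
        for text, label in zip(texts, labels):
            self.counts[label].update(self.featurize(text))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def log_odds(self, text):
        """Return log P(class 1 | text) - log P(class 0 | text)."""
        score = math.log(self.docs[1] / self.docs[0])
        totals = {c: sum(self.counts[c].values()) + len(self.vocab)
                  for c in (0, 1)}
        for feature in self.featurize(text):
            if feature in self.vocab:  # ignore features never seen in training
                score += math.log((self.counts[1][feature] + 1) / totals[1])
                score -= math.log((self.counts[0][feature] + 1) / totals[0])
        return score


def unigrams(text):
    """Step-1 features (assumed): single characters."""
    return list(text)


def bigrams(text):
    """Step-2 features (assumed): overlapping character pairs."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def two_step_classify(text, step1, step2, margin):
    """Trust step 1 outside the fuzzy area; otherwise defer to step 2."""
    score = step1.log_odds(text)
    if abs(score) >= margin:  # confident first-step decision
        return (1 if score > 0 else 0), "step1"
    # Inside the fuzzy area: let the richer second-step classifier decide.
    return (1 if step2.log_odds(text) > 0 else 0), "step2"


# Toy data: both classes share the characters "a" and "b",
# so only character order (captured by bigrams) separates "ab" from "ba".
train_texts = ["xxab", "ab", "yyba", "ba"]
train_labels = [1, 1, 0, 0]
step1 = NaiveBayes(unigrams).fit(train_texts, train_labels)
step2 = NaiveBayes(bigrams).fit(train_texts, train_labels)

for doc in ["xxx", "ab", "ba"]:
    print(doc, two_step_classify(doc, step1, step2, margin=1.0))
```

In this toy setup, a document such as "ab" has a first-step log-odds of zero and therefore falls inside the assumed fuzzy area, so the bigram-based second step decides its label. How wide the fuzzy area should be is a tuning choice; the value `margin=1.0` here is arbitrary.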