Research on text categorization based on a weakly-supervised transfer learning method

Authors:
Dequan Zheng; Chenghe Zhang; Geli Fei; Tiejun Zhao
Affiliations:
MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China (all authors)
Venue:
CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
Year:
2012

Abstract

This paper presents a weakly-supervised text categorization method based on transfer learning, which does not require tagging new training documents when facing classification tasks in a new domain. Instead, it makes use of documents already tagged in other domains to accomplish the automatic categorization task. By extracting linguistic information such as part-of-speech, semantic, and keyword co-occurrence features, we construct a domain-adaptive transfer knowledge base. Experiments show that the presented method improves text categorization performance on a traditional corpus, and that its results are only about 5% lower than the baseline on cross-domain classification tasks, which demonstrates the effectiveness of the method.