This paper presents a weakly supervised, transfer-learning-based text categorization method that does not require tagging new training documents when facing classification tasks in a new domain. Instead, it makes use of documents already tagged in other domains to accomplish the automatic categorization task. By extracting linguistic information such as part-of-speech, semantic, and keyword co-occurrence features, we construct a domain-adaptive transfer knowledge base. Experiments show that the presented method improves text categorization performance on a traditional corpus, and that its results are only about 5% lower than the baseline on cross-domain classification tasks, which demonstrates the effectiveness of the method.
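The abstract does not say how keyword co-occurrence is computed, so the following is only an illustrative sketch of one common way to gather such statistics from already tagged source-domain documents. The function name `keyword_cooccurrence`, the toy documents, and the keyword list are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(documents, keywords):
    """Count how often each unordered keyword pair appears in the same document.

    documents: iterable of token lists (assumed to be pre-tokenized).
    keywords:  iterable of keywords of interest (an assumed, hand-picked list).
    """
    keyword_set = set(keywords)
    counts = Counter()
    for tokens in documents:
        # Keep only the keywords present in this document; sort them so each
        # pair is always stored in the same (alphabetical) order.
        present = sorted(keyword_set.intersection(tokens))
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Toy source-domain documents (hypothetical data, for illustration only).
source_docs = [
    ["stock", "market", "price", "rise"],
    ["market", "price", "fall"],
    ["team", "match", "score"],
]
cooc = keyword_cooccurrence(source_docs, ["stock", "market", "price", "score"])
```

Document-level pair counts of this kind could serve as one ingredient of a transfer knowledge base. How the paper combines them with part-of-speech and semantic information is not described in the abstract.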