The goal of text categorization is to classify documents into a fixed number of predefined categories. Previous work in this area has relied on large numbers of labeled training documents for supervised learning. The difficulty is that such labeled documents are costly to create: unlabeled documents are easy to collect, but categorizing them manually for training is not. In this paper, we propose an unsupervised learning method to overcome this difficulty. The proposed method divides documents into sentences and categorizes each sentence using a keyword list for each category and a sentence similarity measure. It then uses the categorized sentences for training. The proposed method performs comparably to traditional supervised learning methods. It can therefore be used where low-cost text categorization is needed, and it can also be used to create training documents.
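The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the keyword lists, the similarity measure, or the learner, so everything concrete here is an assumption made for illustration: cosine similarity over bag-of-words counts, a similarity threshold of 0.3, a per-category centroid classifier standing in for the training step, and the hypothetical function names `label_sentences`, `train_centroids`, and `classify`.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive split on ., ! and ?; a real system would use a proper sentence splitter.
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def bag_of_words(text):
    # Lowercased word counts (assumption: no stemming or stopword removal).
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def label_sentences(documents, keywords, threshold=0.3):
    """Assign categories to sentences without any manually labeled data."""
    sentences = [s for doc in documents for s in split_sentences(doc)]
    labeled, unlabeled = [], []
    # Step 1: a sentence that matches the keyword list of exactly one
    # category takes that category.
    for s in sentences:
        words = set(bag_of_words(s))
        hits = [c for c, kws in keywords.items() if words & kws]
        if len(hits) == 1:
            labeled.append((s, hits[0]))
        else:
            unlabeled.append(s)
    # Step 2: each remaining sentence takes the category of its most similar
    # keyword-labeled sentence, if the similarity reaches the threshold.
    seeds = [(bag_of_words(s), c) for s, c in labeled]
    for s in unlabeled:
        vec = bag_of_words(s)
        best_sim, best_cat = max(
            ((cosine(vec, sv), c) for sv, c in seeds), default=(0.0, None)
        )
        if best_cat is not None and best_sim >= threshold:
            labeled.append((s, best_cat))
    return labeled


def train_centroids(labeled):
    # Training step (assumption): sum the word counts of each category's sentences.
    centroids = {}
    for s, c in labeled:
        centroids.setdefault(c, Counter()).update(bag_of_words(s))
    return centroids


def classify(text, centroids):
    # Pick the category whose centroid is most similar to the text.
    vec = bag_of_words(text)
    return max(centroids, key=lambda c: cosine(vec, centroids[c]))


# Hypothetical keyword lists and documents, invented for this example.
keywords = {"sports": {"football", "goal"}, "finance": {"bank", "stock"}}
documents = [
    "The football match ended late. The match was decided by a late penalty.",
    "The bank raised interest rates. Higher interest rates worry investors.",
]
centroids = train_centroids(label_sentences(documents, keywords))
print(classify("Investors watch interest rates", centroids))
```

In this sketch, the only human input is the per-category keyword list. Sentences without a keyword are labeled by similarity to keyword-labeled sentences, which is how the amount of automatically labeled training material grows beyond the direct keyword matches.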