This paper proposes FACT, a Fully Automatic Categorization approach for Text that exploits semantic features from WordNet together with document clustering. FACT constructs its training data automatically, using knowledge of the category name. With the support of WordNet, it first uses the category name to generate a set of features for the corresponding category, then labels a set of documents according to those features. To reduce the bias that may originate from the category name and the generated features, document clustering is used to refine the initial labeling. The refined labels then form the training data for a discriminative classifier. Experiments show that, at its best, FACT achieves more than 90% of the F1 score of the baseline SVM classifiers, which demonstrates the effectiveness of the proposed approach.
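The first two steps described above (generating features from a category name, then labeling documents by those features) can be illustrated with a minimal sketch. It is not the paper's implementation: the `TOY_LEXICON` dictionary is a hypothetical stand-in for WordNet lookups, the overlap-count scoring rule and the function names `category_features` and `initial_label` are assumptions made for illustration, and the clustering refinement and classifier training steps are omitted.

```python
from collections import Counter

# Hypothetical stand-in for WordNet: related terms for each category name.
# A real system would query WordNet here; these entries are invented.
TOY_LEXICON = {
    "sports": {"sport", "game", "team", "match", "athlete"},
    "finance": {"bank", "money", "stock", "market", "investment"},
}


def category_features(name):
    """Return the category name plus its related terms from the toy lexicon."""
    return {name} | TOY_LEXICON.get(name, set())


def initial_label(doc, categories):
    """Pick the category whose features occur most often in the document.

    Returns None when no feature matches or when the top two categories tie,
    so ambiguous documents stay unlabeled (an assumed policy).
    """
    tokens = Counter(doc.lower().split())
    scores = {c: sum(tokens[f] for f in category_features(c)) for c in categories}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best, best_score = ranked[0]
    if best_score == 0 or (len(ranked) > 1 and ranked[1][1] == best_score):
        return None
    return best


docs = [
    "the team won the match after a tense game",
    "the bank raised rates and the stock market fell",
    "the weather was mild today",
]
labels = [initial_label(d, ["sports", "finance"]) for d in docs]
print(labels)
```

Documents left as `None` would simply stay out of the automatically built training set in this sketch; the abstract's clustering step is what addresses labels biased by the category name itself.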