Improving text categorization bootstrapping via unsupervised learning
ACM Transactions on Speech and Language Processing (TSLP)
We propose a generalized bootstrapping algorithm in which each category is described by a set of relevant seed features. Our method introduces two unsupervised steps that improve the initial categorization step of the bootstrapping scheme: (i) using a Latent Semantic space to obtain a generalized similarity measure between instances and features, and (ii) using the Gaussian Mixture algorithm to obtain uniform classification probabilities for unlabeled examples. The algorithm was evaluated on two text categorization tasks and obtained state-of-the-art performance using only the category names as initial seeds.
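As a rough illustration of step (ii), the sketch below fits a two-component, one-dimensional Gaussian mixture to similarity scores with EM and reads the posterior of the higher-mean component as a classification probability. It is one plausible reading of the abstract, not the authors' implementation: the assumption that each instance has a single similarity score per category, the choice of two components, the function names (`fit_two_component_gmm`, `category_probability`), and the toy scores are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(scores, iterations=50, min_var=1e-6):
    """Fit a two-component 1-D Gaussian mixture to similarity scores with EM.

    Returns (weights, means, variances), ordered so that component 1
    has the higher mean (assumed here to model "belongs to the category").
    """
    n = len(scores)
    overall_mean = sum(scores) / n
    overall_var = max(sum((s - overall_mean) ** 2 for s in scores) / n, min_var)
    # Initialise one component at each extreme of the observed scores.
    means = [min(scores), max(scores)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: responsibility of each component for each score.
        resp = []
        for s in scores:
            p = [weights[k] * gaussian_pdf(s, means[k], variances[k]) for k in (0, 1)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * s for r, s in zip(resp, scores)) / nk
            var = sum(r[k] * (s - means[k]) ** 2 for r, s in zip(resp, scores)) / nk
            variances[k] = max(var, min_var)  # floor avoids a collapsed component

    if means[0] > means[1]:
        # Keep the ordering convention: component 1 has the higher mean.
        weights.reverse()
        means.reverse()
        variances.reverse()
    return weights, means, variances


def category_probability(score, params):
    """Posterior probability that a score was drawn from the higher-mean component."""
    weights, means, variances = params
    p = [weights[k] * gaussian_pdf(score, means[k], variances[k]) for k in (0, 1)]
    total = p[0] + p[1]
    return p[1] / total if total > 0 else 0.5


# Hypothetical similarity scores between documents and one category's seeds.
toy_scores = [0.05, 0.08, 0.10, 0.12, 0.07, 0.80, 0.85, 0.90]
toy_params = fit_two_component_gmm(toy_scores)
```

Because the posterior always lies between 0 and 1 regardless of the raw score range, scores fitted separately for different categories become comparable, which is one way to interpret the "uniform classification probabilities" mentioned above.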