The rapid construction of supervised text classification models is becoming a pervasive need across many modern applications. To reduce human-labeling bottlenecks, many new statistical paradigms (e.g., active, semi-supervised, transfer, and multi-task learning) have been vigorously pursued in the recent literature, with varying degrees of empirical success. Concurrently, the emergence of Web 2.0 platforms in the last decade has enabled a worldwide, collaborative human effort to construct a massive ontology of concepts with very rich, detailed, and accurate descriptions. In this paper, we propose a new framework that extracts supervisory information from such ontologies and complements it with a shift in human effort: from directly labeling examples in the domain of interest to the much more efficient task of identifying concept-class associations. Through empirical studies on text categorization problems using the Wikipedia ontology, we show that this shift allows very high-quality models to be induced immediately at virtually no cost.