Domain kernels for text categorization

Authors:
Alfio Gliozzo; Carlo Strapparava
Affiliations:
ITC-Irst, Trento, Italy; ITC-Irst, Trento, Italy
Venue:
CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
Year:
2005

Abstract

In this paper we propose and evaluate a technique for semi-supervised learning in Text Categorization. In particular, we define a kernel function, the Domain Kernel, that lets us plug "external knowledge" into the supervised learning process. External knowledge is acquired from unlabeled data in a totally unsupervised way and is represented by means of Domain Models. We evaluate the Domain Kernel on two standard Text Categorization benchmarks, with good results, and compare its performance with that of a kernel function that uses a standard bag-of-words feature representation. The learning curves show that the Domain Kernel drastically reduces the amount of training data required for learning.
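
The abstract does not spell out how the Domain Kernel is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a Domain Model can be represented as a table of term-to-domain relevance weights. The table `DOMAIN_MODEL`, the two domains, and all function names are hypothetical and hand-written here, whereas in the paper's setting the model would be acquired from unlabeled data. The sketch contrasts a bag-of-words cosine kernel with a cosine kernel computed after mapping documents into the domain space.

```python
import math
from collections import Counter

# Hypothetical Domain Model (an assumption for illustration): each term maps
# to relevance weights over two made-up domains, (ECONOMY, SPORT).
# In the paper's setting such a model would be induced from unlabeled text.
DOMAIN_MODEL = {
    "bank":     (0.9, 0.0),
    "loan":     (1.0, 0.0),
    "interest": (0.8, 0.1),
    "money":    (0.9, 0.1),
    "match":    (0.0, 1.0),
    "goal":     (0.1, 0.9),
}
NUM_DOMAINS = 2


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def bow_kernel(tokens_a, tokens_b):
    """Baseline: cosine between term-frequency (bag-of-words) vectors."""
    return cosine(Counter(tokens_a), Counter(tokens_b))


def domain_vector(tokens):
    """Map a document into domain space by summing its terms' domain weights.

    Terms missing from the Domain Model contribute nothing.
    """
    vec = [0.0] * NUM_DOMAINS
    for token in tokens:
        for i, weight in enumerate(DOMAIN_MODEL.get(token, (0.0,) * NUM_DOMAINS)):
            vec[i] += weight
    return dict(enumerate(vec))


def domain_kernel(tokens_a, tokens_b):
    """Cosine between documents after projection into domain space."""
    return cosine(domain_vector(tokens_a), domain_vector(tokens_b))


doc_finance_1 = ["bank", "loan"]
doc_finance_2 = ["money", "interest"]
doc_sport = ["match", "goal"]

print("bag-of-words:", bow_kernel(doc_finance_1, doc_finance_2))
print("domain (same topic):", domain_kernel(doc_finance_1, doc_finance_2))
print("domain (different topic):", domain_kernel(doc_finance_1, doc_sport))
```

In this toy setup the two finance documents share no words, so the bag-of-words kernel gives them zero similarity, while the domain-space kernel rates them as close and rates the sport document as distant. This is one plausible reading of how external knowledge could reduce the amount of labeled data needed: similarity no longer depends on exact lexical overlap.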