Exploiting unlabeled data for question classification

Authors:
David Tomás; Claudio Giuliano
Affiliations:
Department of Software and Computing Systems, University of Alicante, Spain; Human Language Technology Group, FBK-Irst, Italy
Venue:
NLDB'11 Proceedings of the 16th international conference on Natural language processing and information systems
Year:
2011

Abstract

In this paper, we introduce a kernel-based approach to question classification. We employ a kernel function based on latent semantic information acquired from Wikipedia. This kernel allows external semantic knowledge to be included in the supervised learning process. By combining this knowledge with a bag-of-words approach by means of composite kernels, we obtain a highly effective question classifier. As the semantic information is acquired from unlabeled text, our system can be easily adapted to different languages and domains. We tested it on a parallel corpus of English and Spanish questions.
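
The idea of a composite kernel can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the latent semantic space comes from a truncated SVD of a term-by-document matrix built from unlabeled text (a tiny toy collection stands in for Wikipedia), and that the composite kernel is the sum of a cosine-normalized bag-of-words kernel and a cosine-normalized latent semantic kernel. All function names, the vocabulary, and the example texts are hypothetical.

```python
import numpy as np

def bow_vectors(texts, vocab):
    """Map each text to a term-count vector over a fixed vocabulary."""
    index = {term: i for i, term in enumerate(vocab)}
    X = np.zeros((len(texts), len(vocab)))
    for row, text in enumerate(texts):
        for tok in text.lower().split():
            if tok in index:  # out-of-vocabulary tokens are ignored
                X[row, index[tok]] += 1.0
    return X

def latent_projection(term_doc, k):
    """Truncated SVD of a (terms x documents) matrix from unlabeled text.
    Returns a (terms x k) matrix that projects term vectors into latent space."""
    U, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k]

def normalized_linear_kernel(A, B):
    """Cosine-normalized dot product; all-zero vectors get similarity 0."""
    denom = np.outer(np.linalg.norm(A, axis=1), np.linalg.norm(B, axis=1))
    denom[denom == 0.0] = 1.0
    return (A @ B.T) / denom

def composite_kernel(XA, XB, P):
    """Assumed composite: bag-of-words kernel plus latent semantic kernel."""
    k_bow = normalized_linear_kernel(XA, XB)
    k_lsa = normalized_linear_kernel(XA @ P, XB @ P)
    return k_bow + k_lsa

# Toy stand-ins (assumptions): a small vocabulary and an unlabeled collection.
vocab = ["who", "wrote", "author", "book", "where", "city", "located", "novel"]
unlabeled = ["author wrote book novel", "city located where",
             "who wrote novel", "where city"]
P = latent_projection(bow_vectors(unlabeled, vocab).T, k=2)

questions = ["who wrote the book",
             "who is the author of the novel",
             "where is the city located"]
X = bow_vectors(questions, vocab)
K = composite_kernel(X, X, P)  # Gram matrix over the three questions
print(np.round(K, 3))
```

In this toy setting the first two questions share only one surface word, yet the latent component places them close together because their terms co-occur in the unlabeled texts. A Gram matrix of this kind could then be passed to any kernel classifier that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.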