Using semi-supervised learning for question classification

  • Authors:
  • Nguyen Thanh Tri;Nguyen Minh Le;Akira Shimazu

  • Affiliations:
  • School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan;School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan;School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan

  • Venue:
  • ICCPOL'06 Proceedings of the 21st international conference on Computer Processing of Oriental Languages: beyond the orient: the research challenges ahead
  • Year:
  • 2006

Abstract

This paper uses unlabelled questions in combination with labelled questions for semi-supervised learning, with the aim of improving performance on the question classification task. We also propose two modifications to Tri-training, a simple but efficient co-training style algorithm, to make it more suitable for question data. Both avoid bootstrap-sampling the training set to obtain different training sets for the three classifiers: the first uses multiple learning algorithms for the classifiers in Tri-training, and the second uses multiple learning algorithms in combination with multiple views. These modifications prevent the error rate at the initial step from increasing, and our experiments show promising results.
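
The first proposal in the abstract can be illustrated with a minimal sketch of one Tri-training style round. The sketch is an assumption-laden illustration, not the authors' implementation: the three toy learners (`NaiveBayes`, `NearestNeighbour`, `FirstWordRule`), the helper names, the class labels and the example questions are all hypothetical, and the error-rate checks of the original Tri-training algorithm are omitted. What it shows is the core idea: all three classifiers are trained on the full labelled set with different algorithms (no bootstrap sampling), and each one is then retrained with the unlabelled questions on which the other two agree.

```python
from collections import Counter, defaultdict
import math

def tokens(q):
    # Lower-cased whitespace tokens with the trailing question mark removed.
    return q.lower().rstrip("?").split()

class NaiveBayes:  # hypothetical stand-in learner 1
    def fit(self, X, y):
        self.prior = Counter(y)
        self.counts = defaultdict(Counter)
        for q, label in zip(X, y):
            self.counts[label].update(tokens(q))
        self.vocab = {w for c in self.counts.values() for w in c}
        return self
    def predict(self, q):
        def score(label):  # log-probability with add-one smoothing
            total = sum(self.counts[label].values()) + len(self.vocab) + 1
            s = math.log(self.prior[label])
            for w in tokens(q):
                s += math.log((self.counts[label][w] + 1) / total)
            return s
        return max(sorted(self.prior), key=score)

class NearestNeighbour:  # hypothetical stand-in learner 2
    def fit(self, X, y):
        self.data = [(set(tokens(q)), label) for q, label in zip(X, y)]
        return self
    def predict(self, q):
        t = set(tokens(q))
        # Label of the training question with the highest Jaccard overlap.
        return max(self.data, key=lambda d: len(t & d[0]) / len(t | d[0]))[1]

class FirstWordRule:  # hypothetical stand-in learner 3
    def fit(self, X, y):
        self.by_word = defaultdict(Counter)
        for q, label in zip(X, y):
            self.by_word[tokens(q)[0]][label] += 1
        self.default = Counter(y).most_common(1)[0][0]
        return self
    def predict(self, q):
        votes = self.by_word.get(tokens(q)[0])
        return votes.most_common(1)[0][0] if votes else self.default

def tri_training_round(labelled, unlabelled, makers):
    # Every classifier starts from the SAME full labelled set; diversity
    # comes from using three different algorithms, not bootstrap samples.
    X, y = zip(*labelled)
    models = [make().fit(X, y) for make in makers]
    retrained = []
    for i in range(3):
        j, k = [m for m in range(3) if m != i]
        # Pseudo-label the questions on which the other two classifiers agree.
        extra = [(q, models[j].predict(q)) for q in unlabelled
                 if models[j].predict(q) == models[k].predict(q)]
        Xi, yi = zip(*(list(labelled) + extra))
        retrained.append(makers[i]().fit(Xi, yi))
    return retrained

def vote(models, q):
    # Final prediction by majority vote of the three classifiers.
    return Counter(m.predict(q) for m in models).most_common(1)[0][0]

# Toy data, invented for illustration only.
LABELLED = [("Who wrote Hamlet?", "HUM"), ("Who discovered penicillin?", "HUM"),
            ("Where is Kyoto?", "LOC"), ("Where is the Nile?", "LOC"),
            ("How many moons does Mars have?", "NUM"),
            ("How many people live in Hanoi?", "NUM")]
UNLABELLED = ["Who invented the telephone?", "Where is Ishikawa?",
              "How many legs does a spider have?"]

models = tri_training_round(LABELLED, UNLABELLED,
                            [NaiveBayes, NearestNeighbour, FirstWordRule])
print(vote(models, "Who painted the Mona Lisa?"))
```

A full version would repeat the round until no classifier changes and would accept pseudo-labelled questions only when the estimated error of the other two classifiers has decreased. The second proposal could be approximated by giving each learner its own feature extractor in place of the shared `tokens` helper, which is again an assumption about how multiple views might be realised.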