A machine learning approach for Indonesian question answering system

  • Authors:
  • Ayu Purwarianti;Masatoshi Tsuchiya;Seiichi Nakagawa

  • Affiliations:
  • Department of Information and Computer Science, Toyohashi University of Technology, Toyohashi-shi, Aichi, Japan;Information and Media Center, Toyohashi University of Technology, Toyohashi-shi, Aichi, Japan;Department of Information and Computer Science, Toyohashi University of Technology, Toyohashi-shi, Aichi, Japan

  • Venue:
  • AIAP'07 Proceedings of the 25th IASTED International Multi-Conference: Artificial Intelligence and Applications
  • Year:
  • 2007

Abstract

We investigate a machine learning approach to building an Indonesian question answering (QA) system. Based on our experimental results on the question classification task, we chose SVM as the machine learning algorithm. As in ordinary QA systems, we divide our system into three subcomponents: a question classifier, a passage retriever, and an answer finder. The SVM algorithm is employed in the question classifier and answer finder modules. To overcome the scarcity of language resources for Indonesian, we introduce a bi-gram frequency attribute extracted from a downloaded newspaper corpus. Our question classifier experiment compares several attribute combinations. A t-test shows that combining the question shallow parser result attribute with the bi-gram frequency attribute gives a significant improvement over the baseline (bag of words). Our question classifier achieves 96% accuracy. We also compare several attribute combinations in the answer finder module. We find that joining the expected answer type (EAT) with the attributes of the question classifier gives a higher Mean Reciprocal Rank (MRR) than using only the EAT attribute or only the question classifier attributes. Our QA system achieves an MRR of 0.52 for exact answers.
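
For readers unfamiliar with the metric, the sketch below shows how an MRR score of the kind reported above is commonly computed: each question contributes the reciprocal of the rank of its first correct answer (0 if no correct answer is returned), and the scores are averaged over all questions. The function name, the data layout, and the use of exact string matching are assumptions made for illustration; the abstract does not describe the authors' evaluation code or any rank cutoff.

```python
from typing import Sequence, Set


def mean_reciprocal_rank(
    ranked_answers: Sequence[Sequence[str]],
    gold_answers: Sequence[Set[str]],
) -> float:
    """Average, over questions, of 1 / rank of the first correct answer.

    ranked_answers[i] is the system's ranked candidate list for question i
    (best first); gold_answers[i] is the set of accepted answers for it.
    Matching is by exact string equality (an assumption of this sketch).
    """
    if len(ranked_answers) != len(gold_answers):
        raise ValueError("one gold answer set is required per question")
    if not ranked_answers:
        return 0.0

    total = 0.0
    for candidates, gold in zip(ranked_answers, gold_answers):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate in gold:
                total += 1.0 / rank  # only the first correct answer counts
                break
        # a question with no correct candidate contributes 0
    return total / len(ranked_answers)


# Hypothetical example: three questions with made-up candidate answers.
system_output = [
    ["Jakarta", "Bandung"],   # correct at rank 1 -> 1.0
    ["1950", "1945"],         # correct at rank 2 -> 0.5
    ["Sukarno"],              # no correct answer  -> 0.0
]
gold = [{"Jakarta"}, {"1945"}, {"Hatta"}]
score = mean_reciprocal_rank(system_output, gold)
```

Under this definition a score of 0.52 means that, averaged over the test questions, the first correct exact answer tends to appear near rank two.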