A support vector machine-based context-ranking model for question answering

  • Authors:
  • Show-Jane Yen, Yu-Chieh Wu, Jie-Chi Yang, Yue-Shi Lee, Chung-Jung Lee, Jui-Jung Liu

  • Affiliations:
  • Department of Computer Science and Information Engineering, Ming Chuan University, No. 5, De-Ming Rd., Gweishan District, Taoyuan 333, Taiwan, ROC
  • Department of Communication and Management, Ming Chuan University, No. 250, Zhong Shan N. Rd., Taipei 111, Taiwan, ROC
  • Graduate Institute of Network Learning Technology, National Central University, No. 300, Jhong-Da Rd., Jhongli City, Taoyuan County 320, Taiwan, ROC
  • Department of Finance, Ming Chuan University, No. 250, Zhong Shan N. Rd., Taipei 111, Taiwan, ROC
  • Department of Information and Electronic Commerce, Kai-Nan University, No. 1, Kainan Rd., Luzhu Shiang, Taoyuan 338, Taiwan, ROC

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2013

Quantified Score

Hi-index 0.07

Abstract

Modern information technologies and Internet services face the problem of selecting and managing an ever-growing amount of textual information, to which timely access is often critical. Machine learning techniques have recently shown excellent performance and flexibility in many applications, such as artificial intelligence and pattern recognition. Question answering (QA) is the task of locating exact answer sentences in vast document collections. This paper presents a machine learning-based question-answering framework that integrates a question classifier, simple document/passage retrievers, and the proposed context-ranking models. The question classifier is trained to categorize the answer type of a given question and instructs the context-ranking model to re-rank the passages returned by the initial retrievers. The method supplies the learner with flexible features, such as word forms, syntactic features, and semantic word features. The proposed context-ranking model, which is formulated as a sequential labeling task, combines these rich features to predict whether an input passage is relevant to the question type. We evaluate the proposed method on TREC-QA tracks and question classification benchmarks. The experimental results show that the question classifier achieves 85.60% accuracy without any additional semantic or syntactic taggers, and 88.60% after applying the proposed term expansion techniques and a predefined related-word set. In the TREC-10 QA task, using the gold TREC-provided relevant document set, the QA model achieves a 0.563 mean reciprocal rank (MRR) score, and a 0.342 MRR score when the simple document and passage retrieval algorithms are used.
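The abstract reports results as mean reciprocal rank (MRR): for each question, the score is the reciprocal of the rank at which the first relevant passage appears, averaged over all questions. The sketch below is not the paper's implementation; it is a minimal pure-Python illustration, with toy scores standing in for the SVM-based context-ranking model's decision values, of how re-ranked passages would be evaluated with MRR. All function names and data here are illustrative assumptions.

```python
# Minimal sketch (not the paper's code): rank retrieved passages by a model
# score and compute the mean reciprocal rank (MRR) used in TREC-QA evaluation.

def mean_reciprocal_rank(ranked_relevance):
    """ranked_relevance: one list per question, holding booleans for its
    passages in ranked order. MRR averages 1/rank of the first relevant
    passage per question (contributing 0 if none is relevant)."""
    total = 0.0
    for flags in ranked_relevance:
        for rank, is_relevant in enumerate(flags, start=1):
            if is_relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)

def rank_passages(passages, score):
    """Sort passages by descending score (e.g. an SVM decision value)."""
    return sorted(passages, key=score, reverse=True)

if __name__ == "__main__":
    # Toy (passage_id, score, is_relevant) triples for a single question.
    scored = [("p1", 0.2, False), ("p2", 0.9, True), ("p3", 0.5, False)]
    ranked = rank_passages(scored, score=lambda p: p[1])
    flags = [[p[2] for p in ranked]]  # the relevant passage lands at rank 1
    print(mean_reciprocal_rank(flags))  # → 1.0
```

A reported MRR of 0.563 thus means that, on average, the first relevant passage appears roughly between rank 1 and rank 2 of the re-ranked list.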