A reinforcement learning framework for answering complex questions

Authors:
Yllias Chali;Sadid A. Hasan;Kaisar Imam
Affiliations:
University of Lethbridge, Lethbridge, AB, Canada;University of Lethbridge, Lethbridge, AB, Canada;University of Lethbridge, Lethbridge, AB, Canada
Venue:
Proceedings of the 16th international conference on Intelligent user interfaces
Year:
2011

Abstract

Scoring sentences in documents given abstract summaries created by humans is important in extractive multi-document summarization. In this paper, we use extractive multi-document summarization techniques to perform complex question answering and formulate it as a reinforcement learning problem. We use a reward function that measures the relatedness of the candidate (machine-generated) summary sentences to the abstract summaries. In the training stage, the learner iteratively selects original document sentences to be included in the candidate summary, evaluates the reward function, and updates the related feature weights accordingly. The final weights found in this phase are used to generate summaries as answers to complex questions given unseen test data. We use a modified linear, gradient-descent version of Watkins' Q(λ) algorithm with an ε-greedy policy to determine the best possible action, i.e., selecting the most important sentences. We compare the performance of this system with a Support Vector Machine (SVM) based system. Evaluation results show that the reinforcement learning method outperforms the SVM system, improving the ROUGE scores by
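
The training loop described in the abstract can be illustrated with a minimal sketch of linear, gradient-descent Watkins' Q(λ) with an ε-greedy policy. This is not the authors' implementation: the abstract does not specify their modifications, features, or reward, so the three features in `featurize`, the word-overlap reward in `overlap`, the sentence `budget`, and all hyperparameter values below are assumptions chosen only for illustration.

```python
import random


def overlap(sentence, reference):
    """Fraction of the sentence's distinct words that also occur in the reference."""
    words = set(sentence.lower().split())
    ref = set(reference.lower().split())
    return len(words & ref) / len(words) if words else 0.0


def featurize(sentence, question):
    # Hypothetical features: bias term, overlap with the question, capped length.
    return [1.0, overlap(sentence, question), min(len(sentence.split()) / 20.0, 1.0)]


def q_value(weights, features):
    # Linear action-value estimate: Q = w . phi
    return sum(w * f for w, f in zip(weights, features))


def train_episode(sentences, question, abstract, weights, budget=2,
                  alpha=0.1, gamma=0.9, lam=0.8, epsilon=0.1, rng=random):
    """One episode of linear Watkins' Q(lambda); returns (new_weights, chosen_indices)."""
    weights = list(weights)
    traces = [0.0] * len(weights)
    remaining = list(range(len(sentences)))
    chosen = []
    while remaining and len(chosen) < budget:
        feats = {i: featurize(sentences[i], question) for i in remaining}
        greedy = max(remaining, key=lambda i: q_value(weights, feats[i]))
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        action = rng.choice(remaining) if rng.random() < epsilon else greedy
        is_greedy = q_value(weights, feats[action]) == q_value(weights, feats[greedy])
        # Watkins' variant: earlier traces are cut when the action is exploratory.
        decay = gamma * lam if is_greedy else 0.0
        traces = [decay * e + f for e, f in zip(traces, feats[action])]
        # Assumed reward: word overlap between the chosen sentence and the human abstract.
        reward = overlap(sentences[action], abstract)
        q_old = q_value(weights, feats[action])
        remaining.remove(action)
        chosen.append(action)
        # TD error bootstraps on the best remaining action unless the episode has ended.
        delta = reward - q_old
        if remaining and len(chosen) < budget:
            delta += gamma * max(q_value(weights, feats[i]) for i in remaining)
        weights = [w + alpha * delta * e for w, e in zip(weights, traces)]
    return weights, chosen


def summarize(sentences, question, weights, budget=2):
    """Test-time answer: the top-scoring sentences under the learned weights."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: q_value(weights, featurize(sentences[i], question)),
                    reverse=True)
    return [sentences[i] for i in ranked[:budget]]
```

In this sketch, training would call `train_episode` repeatedly over question/document/abstract triples, carrying the returned weights forward, and `summarize` would then be applied to unseen questions using the final weights. Sentence features here do not depend on what has already been selected, which is a simplification.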