Learning to rank for why-question answering

Authors:
Suzan Verberne;Hans Halteren;Daphne Theijssen;Stephan Raaijmakers;Lou Boves
Affiliations:
Centre for Language and Speech Technology, Radboud University, Nijmegen, Netherlands;Centre for Language and Speech Technology, Radboud University, Nijmegen, Netherlands;Department of Linguistics, Radboud University, Nijmegen, Netherlands;TNO Information and Communication Technology, Delft, Netherlands;Centre for Language and Speech Technology, Radboud University, Nijmegen, Netherlands
Venue:
Information Retrieval
Year:
2011

Abstract

In this paper, we evaluate a number of machine learning techniques for the task of ranking answers to why-questions. We use TF-IDF together with a set of 36 linguistically motivated features that characterize questions and answers. We experiment with several machine learning techniques (including several classifiers and regression techniques, Ranking SVM, and SVMmap) in various settings. The purpose of the experiments is to assess how well the different machine learning approaches cope with our highly imbalanced binary relevance data, with and without hyperparameter tuning. We find that every machine learning technique yields an MRR score that is significantly above the TF-IDF baseline of 0.25 and not significantly lower than the best score of 0.35. We provide an in-depth analysis of the effect of data imbalance and hyperparameter tuning, and we relate our findings to previous research on learning to rank for Information Retrieval.
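
The abstract reports results as MRR (mean reciprocal rank) over binary relevance judgments but does not spell out the computation. The following is a minimal sketch of the standard definition, not the authors' evaluation code: for each question, take the reciprocal of the rank of the first relevant answer (0 if no relevant answer is retrieved), then average over questions. The function names and the toy rankings are hypothetical and chosen only for illustration; any rank cutoff used in the paper is not modeled here.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[int]) -> float:
    """Return 1/rank of the first relevant answer, or 0.0 if none is relevant.

    `relevance` holds binary labels (1 = relevant, 0 = not relevant)
    in the order the system ranked the candidate answers.
    """
    for rank, label in enumerate(relevance, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings: Sequence[Sequence[int]]) -> float:
    """Average the reciprocal rank over all questions (0.0 for no questions)."""
    if not rankings:
        return 0.0
    return sum(reciprocal_rank(r) for r in rankings) / len(rankings)


# Hypothetical toy data: one ranked label list per question.
# Most labels are 0, loosely mirroring imbalanced relevance data.
toy_rankings = [
    [0, 1, 0, 0],  # first relevant answer at rank 2 -> 0.5
    [1, 0, 0, 0],  # first relevant answer at rank 1 -> 1.0
    [0, 0, 0, 0],  # no relevant answer retrieved    -> 0.0
    [0, 0, 0, 1],  # first relevant answer at rank 4 -> 0.25
]

print(f"MRR = {mean_reciprocal_rank(toy_rankings):.4f}")  # prints "MRR = 0.4375"
```

Under this definition, a question with no relevant answer in its ranked list contributes 0, so MRR rewards both retrieving a relevant answer and placing it near the top.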