Discovering links between lexical and surface features in questions and answers

Authors:
Soumen Chakrabarti
Affiliations:
IIT Bombay
Venue:
WebKDD'04 Proceedings of the 6th international conference on Knowledge Discovery on the Web: advances in Web Mining and Web Usage Analysis
Year:
2004

Abstract

Information retrieval systems based on keyword match are evolving into question answering systems that return short passages or direct answers to questions, rather than URLs pointing to whole pages. Most open-domain question answering systems depend on manually designed hierarchies of question types. A question is first classified into a fixed type, and then hand-engineered rules associated with that type yield keywords and/or predictive annotations that are likely to match indexed answer passages. Here we seek a more data-driven approach, assisted by machine learning. We propose a simple log-linear model over a pair of feature vectors, one derived from the question and the other derived from a candidate passage. Features are extracted using a lexical network and surface context, as in named entity extraction, except that no direct supervision is available in the form of fixed entity types and their examples. Using the log-linear model, we filter candidate passages and see substantial improvement in the mean rank at which the first answer is found. The model parameters distill and reveal linguistic artifacts coupling questions and their answers, which can be used for better annotation and indexing.
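
The abstract does not spell out the exact form of the log-linear model, so the following is only a minimal sketch under stated assumptions: binary features represented as sets of strings, one weight per (question feature, passage feature) pair, and a logistic link trained by plain gradient ascent. The feature names (`how_many`, `has_number`, and so on), the toy data, and all function names are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math
from itertools import product


def pair_score(weights, q_feats, p_feats):
    """Sum the weights of every (question feature, passage feature) pair that fires."""
    return sum(weights.get((q, p), 0.0) for q, p in product(q_feats, p_feats))


def answer_probability(weights, q_feats, p_feats):
    """Logistic probability that the passage answers the question (assumed model form)."""
    return 1.0 / (1.0 + math.exp(-pair_score(weights, q_feats, p_feats)))


def train(examples, epochs=200, lr=0.5):
    """Fit pair weights by stochastic gradient ascent on the log-likelihood."""
    weights = {}
    for _ in range(epochs):
        for q_feats, p_feats, label in examples:
            # Gradient of the log-likelihood w.r.t. each firing pair weight.
            error = label - answer_probability(weights, q_feats, p_feats)
            for pair in product(q_feats, p_feats):
                weights[pair] = weights.get(pair, 0.0) + lr * error
    return weights


def rank_passages(weights, q_feats, passages):
    """Order candidate passages from most to least likely to contain an answer."""
    return sorted(
        passages,
        key=lambda p_feats: answer_probability(weights, q_feats, p_feats),
        reverse=True,
    )


# Toy, made-up training data: (question features, passage features, is_answer).
examples = [
    ({"how_many"}, {"has_number"}, 1),
    ({"how_many"}, {"has_person"}, 0),
    ({"who"}, {"has_person"}, 1),
    ({"who"}, {"has_number"}, 0),
]

weights = train(examples)
ranked = rank_passages(weights, {"who"}, [{"has_number"}, {"has_person"}])
```

In this sketch, inspecting `weights` after training shows which question and passage features the model has coupled (for example, a positive weight on the hypothetical pair `("who", "has_person")`), which is the kind of readable parameter the abstract says can inform annotation and indexing.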