A framework for merging and ranking of answers in DeepQA

Authors:
D. C. Gondek;A. Lally;A. Kalyanpur;J. W. Murdock;P. A. Duboue;L. Zhang;Y. Pan;Z. M. Qiu;C. Welty
Affiliations:
IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY;IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY;IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY;IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY;Les Laboratoires Foulab, Montreal, Quebec, Canada;IBM Research Division, China Research Lab, Beijing, China;IBM Research Division, China Research Lab, Beijing, China;IBM Research Division, China Research Lab, Beijing, China;IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY
Venue:
IBM Journal of Research and Development
Year:
2012

Abstract

The final stage in the IBM DeepQA pipeline involves ranking all candidate answers according to their evidence scores and judging the likelihood that each candidate answer is correct. In DeepQA, this is done using a phase-based machine learning framework that provides capabilities for manipulating the data and applying machine learning in successive phases. We show how this design can be used to implement solutions to particular challenges that arise in applying machine learning for evidence-based hypothesis evaluation. Our approach facilitates an agile development environment for DeepQA; evidence scoring strategies can be easily introduced, revised, and reconfigured without the need for error-prone manual effort to determine how to combine the various evidence scores. We describe the framework, explain the challenges, and evaluate the gain over a baseline machine learning approach.
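To make the idea of phase-based merging and ranking concrete, the following is a minimal sketch and not the DeepQA implementation. The abstract does not name the learner, the merging rule, or the number of phases, so everything here is an assumption made for illustration: logistic regression as the per-phase model, merging by normalized answer text with a per-feature maximum, and the hypothetical names `train_logistic`, `merge_candidates`, and `rank_in_phases`. All numbers are invented toy values.

```python
import math
from typing import Dict, List, Tuple

# A candidate is (answer text, evidence scores). Illustrative type only.
Candidate = Tuple[str, List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows: List[Tuple[List[float], int]],
                   epochs: int = 500, lr: float = 0.5) -> List[float]:
    """Fit logistic-regression weights (bias stored last) by batch gradient descent."""
    w = [0.0] * (len(rows[0][0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in rows:
            err = confidence(w, x) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
            grad[-1] += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w


def confidence(w: List[float], x: List[float]) -> float:
    """Estimated probability that a candidate with evidence scores x is correct."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])


def merge_candidates(cands: List[Candidate]) -> List[Candidate]:
    """Assumed merging rule: same normalized text, keep the max of each score."""
    merged: Dict[str, List[float]] = {}
    for text, scores in cands:
        key = text.strip().lower()
        if key in merged:
            merged[key] = [max(a, b) for a, b in zip(merged[key], scores)]
        else:
            merged[key] = list(scores)
    return list(merged.items())


def rank_in_phases(cands: List[Candidate],
                   phase_weights: List[List[float]],
                   keep: int = 3) -> List[Tuple[str, float]]:
    """Apply one model per phase; each phase passes only its top `keep` candidates on."""
    pool = merge_candidates(cands)
    ranked: List[Tuple[float, str]] = []
    for w in phase_weights:
        ranked = sorted(((confidence(w, x), t) for t, x in pool), reverse=True)
        survivors = {t for _, t in ranked[:keep]}
        pool = [(t, x) for t, x in pool if t in survivors]
    return [(t, c) for c, t in ranked]


# Toy training rows: (evidence scores, 1 if the answer was correct else 0).
toy_rows = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.2, 0.1], 0), ([0.1, 0.3], 0)]
toy_model = train_logistic(toy_rows)

toy_candidates = [("Paris", [0.9, 0.8]), (" paris ", [0.7, 0.95]),
                  ("Lyon", [0.4, 0.5]), ("Nice", [0.1, 0.2])]

# The same toy model is reused for both phases purely to keep the sketch short.
for answer, conf in rank_in_phases(toy_candidates, [toy_model, toy_model], keep=2):
    print(f"{answer}: {conf:.3f}")
```

In this sketch, adding a new evidence scorer only means adding one more feature to each score vector and retraining, which is the kind of workflow the abstract describes: no hand-tuned rule for combining scores is needed. A real system would presumably train a separate model for each phase on the candidates that reach it.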