Structured data and inference in DeepQA

Authors:
A. Kalyanpur; B. K. Boguraev; S. Patwardhan; J. W. Murdock; A. Lally; C. Welty; J. M. Prager; B. Coppola; A. Fokoue-Nkoutche; L. Zhang; Y. Pan; Z. M. Qiu
Affiliations:
A. Kalyanpur, B. K. Boguraev, S. Patwardhan, J. W. Murdock, A. Lally, C. Welty, J. M. Prager, B. Coppola, A. Fokoue-Nkoutche: IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, NY
L. Zhang, Y. Pan, Z. M. Qiu: IBM Research Division, China Research Lab, Beijing, China
Venue:
IBM Journal of Research and Development
Year:
2012

Abstract

Although the majority of evidence analysis in DeepQA is focused on unstructured information (e.g., natural-language documents), several components in the DeepQA system use structured data (e.g., databases, knowledge bases, and ontologies) to generate potential candidate answers or find additional evidence. Structured data analytics are a natural complement to unstructured methods in that they typically cover a narrower range of questions but are more precise within that range. Moreover, structured data that has formal semantics is amenable to logical reasoning techniques that can be used to provide implicit evidence. The DeepQA system does not contain a single monolithic structured data module; instead, it allows for different components to use and integrate structured and semistructured data, with varying degrees of expressivity and formal specificity. This paper is a survey of DeepQA components that use structured data. Areas in which evidence from structured sources has the most impact include typing of answers, application of geospatial and temporal constraints, and the use of formally encoded a priori knowledge of commonly appearing entity types such as countries and U.S. presidents. We present details of appropriate components and demonstrate their end-to-end impact on the IBM Watson system.