Multilingual ontologies for cross-language information extraction and semantic search

  • Authors:
  • David W. Embley; Stephen W. Liddle; Deryle W. Lonsdale; Yuri Tijerino

  • Affiliations:
  • Department of Computer Science, Brigham Young University, Provo, Utah; Information Systems Department, Brigham Young University, Provo, Utah; Department of Linguistics and English Language, Brigham Young University, Provo, Utah; Department of Applied Informatics, Kwansei Gakuin University, Kobe-Sanda, Japan

  • Venue:
  • ER'11 Proceedings of the 30th international conference on Conceptual modeling
  • Year:
  • 2011

Abstract

Valuable local information is often available on the web but written in a language that non-local users do not understand. Can we create a system that lets a user query in language L1 for facts in a web page written in language L2? We propose a suite of multilingual extraction ontologies as a solution to this problem. We ground extraction ontologies in each language of interest, and we map both the data and the metadata among the language-specific extraction ontologies. The mappings pass through a central, language-agnostic ontology, so adding a new language requires only one mapping to the central ontology rather than one mapping for each language pair. Results from an early implemented prototype demonstrate the feasibility of cross-language information extraction and semantic search. Further, results from an experimental evaluation of ontology-based query translation and extraction accuracy are remarkably good given the complexity of the problem and the complications of its implementation.