From Glossaries to Ontologies: Extracting Semantic Structure from Textual Definitions

  • Authors: Roberto Navigli; Paola Velardi
  • Affiliations: Università di Roma “La Sapienza”, Roma, Italy (both authors)
  • Venue: Proceedings of the 2008 conference on Ontology Learning and Population: Bridging the Gap between Text and Knowledge
  • Year: 2008

Abstract

Learning ontologies requires the acquisition of relevant domain concepts and of both taxonomic and non-taxonomic relations. In this chapter, we present a methodology for automatic ontology enrichment and document annotation with concepts and relations of an existing domain core ontology. Natural language definitions from available glossaries in a given domain are processed, and regular expressions are applied to identify general-purpose and domain-specific relations. We evaluate the performance of the methodology in extracting hypernymy and non-taxonomic relations. To this end, we annotated and formalized a relevant fragment of the glossary of Art and Architecture (AAT) with a set of 10 relations (plus the hypernymy relation) defined in the CRM CIDOC cultural heritage core ontology, a recent W3C standard. Finally, we assessed the generality of the approach on a set of web pages from the domains of history and biography.
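
As a rough illustration of the general idea described in the abstract (applying regular expressions to glossary definitions to pull out relations), the following minimal Python sketch extracts a candidate hypernym and a candidate material from a definition. The patterns, the function names `extract_hypernym` and `extract_material`, and the sample definitions are all assumptions made for illustration; they are not the authors' actual patterns or relation set.

```python
import re

# Hypothetical pattern: a definition that opens with "a/an/the <genus> ..."
# followed by a cue word that typically starts the differentia.
HYPERNYM_RE = re.compile(
    r"^(?:an?|the)\s+(?P<genus>[a-z][a-z\- ]*?)\s+"
    r"(?:that|which|who|used|made|of|for|in|with)\b",
    re.IGNORECASE,
)

# Hypothetical pattern for one non-taxonomic relation ("made of <material>").
MATERIAL_RE = re.compile(r"\bmade (?:of|from)\s+(?P<material>[a-z]+)", re.IGNORECASE)


def extract_hypernym(definition):
    """Return the candidate genus term of a definition, or None if no match."""
    m = HYPERNYM_RE.match(definition.strip())
    return m.group("genus").lower() if m else None


def extract_material(definition):
    """Return the candidate material mentioned in a definition, or None."""
    m = MATERIAL_RE.search(definition)
    return m.group("material").lower() if m else None


if __name__ == "__main__":
    # Invented sample definitions, not taken from the AAT glossary.
    samples = [
        "A vessel made of clay.",
        "An ornamental band that runs along a wall.",
    ]
    for text in samples:
        print(text, "|", extract_hypernym(text), "|", extract_material(text))
```

Patterns this simple will miss many definitional styles; the sketch is only meant to show how a definition's surface form can be mapped to a relation instance.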