Extracting Enterprise Vocabularies Using Linked Open Data

  • Authors:
  • Julian Dolby; Achille Fokoue; Aditya Kalyanpur; Edith Schonberg; Kavitha Srinivas

  • Affiliations:
  • IBM Watson Research Center, Yorktown Heights, USA 10598; IBM Watson Research Center, Yorktown Heights, USA 10598; IBM Watson Research Center, Yorktown Heights, USA 10598; IBM Watson Research Center, Yorktown Heights, USA 10598; IBM Watson Research Center, Yorktown Heights, USA 10598

  • Venue:
  • ISWC '09 Proceedings of the 8th International Semantic Web Conference
  • Year:
  • 2009

Abstract

A common vocabulary is vital to smooth business operation, yet codifying and maintaining an enterprise vocabulary is an arduous, manual task. We describe a process to automatically extract a domain-specific vocabulary (terms and types) from unstructured data in the enterprise, guided by term definitions in Linked Open Data (LOD). We validate our techniques by applying them to the IT (Information Technology) domain, taking 58 Gartner analyst reports and using two specific LOD sources: DBpedia and Freebase. We show initial findings that address the generalizability of these techniques for vocabulary extraction in new domains, such as the energy industry.