Brief communication: Retrieving definitional content for ontology development

  • Authors:
  • L. Smith; W. J. Wilbur

  • Affiliations:
  • National Center for Biotechnology Information, NIH, NLM, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA

  • Venue:
  • Computational Biology and Chemistry
  • Year:
  • 2004

Abstract

Ontology construction requires an understanding of the meaning and usage of its encoded concepts. While definitions found in dictionaries or glossaries may be adequate for many concepts, the actual usage in expert writing could be a better source of information for many others. The goal of this paper is to describe an automated procedure for finding definitional content in expert writing. The approach uses machine learning on phrasal features to learn when sentences in a book contain definitional content, as determined by their similarity to glossary definitions provided in the same book. The end result is not a concise definition of a given concept but, for each sentence, a predicted probability that it contains information relevant to a definition. The approach is evaluated automatically for terms with explicit definitions and manually for terms with no available definition.
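
The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the learner, the phrasal features, or the similarity measure, so everything here is an assumption chosen for brevity. Word unigrams and bigrams stand in for phrasal features, Jaccard overlap with a glossary definition stands in for the similarity criterion, and a simplified naive Bayes score (using only the phrases present in a sentence) stands in for the machine learning step. The toy sentences, the glossary, the 0.2 threshold, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def phrases(text, max_n=2):
    """Return the set of word n-grams (n = 1..max_n); an assumed stand-in for phrasal features."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap of two phrase sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_sentences(sentences, glossary, threshold=0.2):
    """Label a sentence 1 if it resembles any glossary definition, else 0."""
    defs = [phrases(d) for d in glossary.values()]
    return [int(any(jaccard(phrases(s), d) >= threshold for d in defs))
            for s in sentences]


def train(sentences, labels):
    """Count, per class, how many sentences contain each phrase."""
    counts = {0: Counter(), 1: Counter()}
    totals = Counter(labels)
    for s, y in zip(sentences, labels):
        counts[y].update(phrases(s))
    vocab = set(counts[0]) | set(counts[1])
    return counts, totals, vocab


def predict_proba(model, sentence):
    """Estimated probability that a sentence carries definitional content."""
    counts, totals, vocab = model
    n = sum(totals.values())
    feats = phrases(sentence) & vocab  # phrases unseen in training are ignored
    log_score = {}
    for y in (0, 1):
        score = math.log((totals[y] + 1) / (n + 2))  # smoothed class prior
        for f in feats:
            score += math.log((counts[y][f] + 1) / (totals[y] + 2))  # add-one smoothing
        log_score[y] = score
    top = max(log_score.values())  # subtract the max for numerical stability
    exp = {y: math.exp(v - top) for y, v in log_score.items()}
    return exp[1] / (exp[0] + exp[1])


# Hypothetical toy data: a tiny "book" and its glossary.
glossary = {
    "enzyme": "a protein that catalyzes a chemical reaction",
    "ligand": "a molecule that binds to a receptor",
}
sentences = [
    "An enzyme is a protein that catalyzes a specific chemical reaction.",
    "A ligand is a molecule that binds to a receptor.",
    "We thank the reviewers for their comments.",
    "The experiment was repeated three times.",
]

labels = label_sentences(sentences, glossary)
model = train(sentences, labels)

# Score a sentence about a term that has no glossary entry.
p_def = predict_proba(model, "A kinase is a protein that transfers phosphate groups.")
p_other = predict_proba(model, "We repeated the experiment for the reviewers.")
print(labels, round(p_def, 3), round(p_other, 3))
```

In this sketch the glossary is used only to produce training labels; the trained model then assigns a probability to any sentence, including sentences about terms that have no glossary entry, which mirrors the two evaluation settings mentioned in the abstract.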