Building a BioWordNet by using WordNet's data formats and WordNet's software infrastructure: a failure story

  • Authors:
  • Michael Poprat; Elena Beisswanger; Udo Hahn

  • Affiliations:
  • Friedrich-Schiller-Universität Jena, Jena, Germany; Friedrich-Schiller-Universität Jena, Jena, Germany; Friedrich-Schiller-Universität Jena, Jena, Germany

  • Venue:
  • SETQA-NLP '08 Software Engineering, Testing, and Quality Assurance for Natural Language Processing
  • Year:
  • 2008

Abstract

In this paper, we describe our efforts to build on WordNet resources, using WordNet lexical data, the data format that it comes with and WordNet's software infrastructure in order to generate a biomedical extension of WordNet, the BioWordNet. We began our efforts on the assumption that the software resources were stable and reliable. In the course of our work, it turned out that this belief was far too optimistic. We discuss the stumbling blocks that we encountered, point out an error in the WordNet software with implications for research based on it, and conclude that building on the legacy of WordNet data structures and its associated software might preclude sustainable extensions that go beyond the domain of general English.