The OLAC metadata set and controlled vocabularies

  • Authors: Steven Bird; Gary Simons
  • Affiliations: University of Pennsylvania, Philadelphia, PA; SIL International, Dallas, TX
  • Venue: STAR '01 Proceedings of the ACL 2001 Workshop on Sharing Tools and Resources - Volume 15
  • Year: 2001

Abstract

As language data and associated technologies proliferate and as the language resources community rapidly expands, it has become difficult to locate and reuse existing resources. Are there any lexical resources for such-and-such a language? What tool can work with transcripts in this particular format? What is a good format to use for linguistic data of this type? Questions like these dominate many mailing lists, since web search engines are an unreliable way to find language resources. This paper describes a new digital infrastructure for language resource discovery, based on the Open Archives Initiative, and called OLAC -- the Open Language Archives Community. The OLAC Metadata Set and the associated controlled vocabularies facilitate consistent description and focussed searching. We report progress on the metadata set and controlled vocabularies, describing current issues and soliciting input from the language resources community.