A hybrid declarative/procedural metadata mapping language based on Python

  • Authors:
  • Greg Janée; James Frew

  • Affiliations:
  • Alexandria Digital Library Project, Institute for Computational Earth System Science, University of California, Santa Barbara, Santa Barbara, CA; Donald Bren School of Environmental Science and Management, University of California, Santa Barbara, Santa Barbara, CA

  • Venue:
  • ECDL'05: Proceedings of the 9th European Conference on Research and Advanced Technology for Digital Libraries
  • Year:
  • 2005

Abstract

The Alexandria Digital Library (ADL) project has been working on automating the processes of building ADL collections and gathering the collection statistics on which ADL's discovery system is based. As part of this effort, we have created a language and supporting programmatic framework for expressing mappings from XML metadata schemas to the required ADL metadata views. This language, based on the Python scripting language, is largely declarative, reflecting the fact that mappings can be largely, though not entirely, expressed as crosswalk-type specifications. At the same time, the language allows mappings to be specified procedurally, which we argue is necessary to deal effectively with the realities of poor-quality, highly variable, and incomplete metadata. An additional key feature of the language is the ability to derive new mappings from existing mappings, which makes it easy to adapt generic mappings to the idiosyncrasies of particular metadata providers. We evaluate this language on three metadata standards (ADN, FGDC, and MARC) and three corresponding collections of metadata. We also note limitations, future research directions, and generalizations of this work.
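
To make the three ideas in the abstract concrete (a declarative crosswalk, procedural handling of messy values, and derivation of one mapping from another), here is a minimal Python sketch. It is an illustration only and does not reproduce the ADL framework's actual language or API: the class names `GenericMapping` and `ProviderMapping`, the `crosswalk` table, the `fix_<field>` hook convention, and the sample record with its element names are all hypothetical assumptions.

```python
import xml.etree.ElementTree as ET


class GenericMapping:
    """Hypothetical generic mapping (not the ADL API)."""

    # Declarative part: a crosswalk from target field to source element path.
    crosswalk = {
        "title": "title",
        "date": "pubdate",
    }

    def map(self, record):
        """Apply the crosswalk to one XML record and return a dict."""
        out = {}
        for field, path in self.crosswalk.items():
            raw = record.findtext(path)  # None if the element is missing
            # Procedural part: an optional fix_<field> method cleans the value.
            hook = getattr(self, "fix_" + field, None)
            value = hook(raw) if hook else raw
            if value:  # silently skip missing or empty values
                out[field] = value
        return out

    def fix_title(self, raw):
        # Collapse stray whitespace; tolerate a missing element.
        return " ".join(raw.split()) if raw else None


class ProviderMapping(GenericMapping):
    """Derived mapping for a hypothetical provider with its own quirks."""

    # Override one crosswalk entry and inherit the rest.
    crosswalk = dict(GenericMapping.crosswalk, date="issued")

    def fix_date(self, raw):
        # Assumed quirk: dates may be "unknown" or carry a trailing time.
        if not raw or raw.strip().lower() == "unknown":
            return None
        return raw.strip()[:10]


record = ET.fromstring(
    "<rec><title>  Santa  Barbara\n Channel </title>"
    "<issued>2004-07-01T00:00:00</issued></rec>"
)
result = ProviderMapping().map(record)
print(result)
```

In this sketch the derived class changes a single crosswalk entry and adds one cleanup method, while the title handling is inherited unchanged. That is the kind of adaptation of a generic mapping to a particular provider that the abstract describes, here approximated with ordinary Python subclassing.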