Measuring and coding language change: An evolving study in a multilayer corpus architecture

  • Authors: Hagen Hirschmann; Anke Lüdeling; Amir Zeldes

  • Affiliations: Humboldt-Universität zu Berlin, Germany (all authors)

  • Venue: Journal on Computing and Cultural Heritage (JOCCH)
  • Year: 2012

Abstract

Our article explores the possibilities of using deeply annotated, incrementally evolving comparable corpora for the study of language change, in this case across the stages from Old High German to New High German. Using the example of the evolution of German past tenses, we show how a variety of categories ranging from low to high complexity interact with the choice between competing linguistic variants. To adequately explore the influence of these categories, we use a multilayer corpus architecture that develops together with our study. We show that a combination of quantitative and qualitative analyses can identify relevant contextual factors, which in turn motivate the addition of new annotation layers over the same data. By making our categorizations explicit as corpus annotations and our data available to other researchers, we promote an open, extensible, and transparent mode of research in which both the raw data and the inferential process are open to scrutiny.