Text on tap: the ACL/DCI

  • Authors:
  • Mark Liberman

  • Affiliations:
  • AT&T Bell Laboratories

  • Venue:
  • HLT '89: Proceedings of the Workshop on Speech and Natural Language
  • Year:
  • 1989

Abstract

There has been a recent upsurge of interest in computational studies of large bodies of text. The aim of such studies varies widely, from lexicography and studies of language change to automatic indexing methods and statistical models for improving the performance of speech recognition systems and optical character readers. In general, corpus-based studies are critical for the development of adequate models of linguistic structure and for insights into the nature of language use. However, research workers have been severely hampered by the lack of appropriate materials, and specifically by the lack of a large enough body of text on which published results can be replicated or extended by others.