Experience with a stack decoder-based HMM CSR and back-off N-gram language models
HLT '91 Proceedings of the workshop on Speech and Natural Language
On the interaction between true source, training, and testing language models
HLT '90 Proceedings of the workshop on Speech and Natural Language
Evaluating natural language processing systems
Communications of the ACM
A freely available wide coverage morphological analyzer for English
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 3
The design for the Wall Street Journal-based CSR corpus
HLT '91 Proceedings of the workshop on Speech and Natural Language
Applying SPHINX-II to the DARPA Wall Street Journal CSR task
HLT '91 Proceedings of the workshop on Speech and Natural Language
There has been a recent upsurge of interest in computational studies of large bodies of text. The aims of such studies vary widely, from lexicography and studies of language change to automatic indexing methods and statistical models for improving the performance of speech recognition systems and optical character readers. In general, corpus-based studies are critical for the development of adequate models of linguistic structure and for insights into the nature of language use. However, researchers have been severely hampered by the lack of appropriate materials, and specifically by the lack of a large enough body of text on which published results can be replicated or extended by others.