You talking to me?: a predictive model for zero auxiliary constructions

Authors:
Andrew Caines;Paula Buttery
Affiliations:
University of Cambridge, UK;University of Cambridge, UK
Venue:
NLPLING '10 Proceedings of the 2010 Workshop on NLP and Linguistics: Finding the Common Ground
Year:
2010

Abstract

Because NLP tools are conventionally trained on data from written sources, they have difficulty handling data from the spoken domain. Yet accurate models of spoken data are increasingly in demand for naturalistic speech generation and for machine translation in speech-like contexts (such as chat windows and SMS). There is a widely held assumption in linguistics that spoken language is an impoverished form of written language. We show, however, that spoken data is not unpredictably irregular and that language models can benefit from detailed consideration of spoken language features. This paper considers one specific construction that is largely restricted to the spoken domain, the ZERO AUXILIARY, and builds a predictive model of that construction for native speakers of British English. The model predicts zero auxiliary occurrence in the BNC with 96.9% accuracy. We demonstrate how this model can be integrated into existing parsing tools, increasing the number of successful parses for the zero auxiliary construction by around 30% and thus improving the performance of NLP applications that rely on parsing.