I will shoot your shopping down and you can shoot all my tins: automatic lexical acquisition from the CHILDES database

Authors:
Paula Buttery;Anna Korhonen
Affiliations:
University of Cambridge, Cambridge, UK;University of Cambridge, Cambridge, UK
Venue:
CACLA '07 Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition
Year:
2007

Abstract

Empirical data on the syntactic complexity of children's speech is important for theories of language acquisition. Much of this data is currently absent from the annotated versions of the CHILDES database. In this preliminary study, we show that the state-of-the-art subcategorization acquisition system of Preiss et al. (2007) can be used to extract large-scale subcategorization (frequency) information from (i) child speech and (ii) child-directed speech within the CHILDES database without any domain-specific tuning. We demonstrate that the acquired information is sufficiently accurate to confirm and extend previously reported research findings. We also report qualitative results that can be used to further improve parsing and lexical acquisition technology for child language data.