Experimental evidence demonstrates that syntactic structure influences human online sentence-processing behavior. Despite this evidence, open questions remain: which type of syntactic structure best explains observed behavior, hierarchical or sequential, and lexicalized or unlexicalized? Recently, Frank and Bod (2011) found that unlexicalized sequential models predict reading times better than unlexicalized hierarchical models, relative to a baseline prediction model that takes word-level factors into account. They conclude that the human parser is insensitive to hierarchical syntactic structure. We investigate these claims and find a more complicated picture than the one they present. First, we show that incorporating additional lexical n-gram probabilities, estimated from several different corpora, into the baseline model of Frank and Bod (2011) eliminates all differences in prediction accuracy between the unlexicalized sequential and hierarchical models. Second, we show that lexicalizing the hierarchical models used by Frank and Bod (2011) significantly improves prediction accuracy relative to the unlexicalized versions. Third, we show that state-of-the-art lexicalized hierarchical models further improve prediction accuracy. Our results demonstrate that Frank and Bod's (2011) claim that sequential models predict reading times better than hierarchical models is premature, and that lexicalization matters for prediction accuracy.
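To make the "lexical n-gram probabilities" predictor concrete, the sketch below shows one standard way such quantities are computed: a bigram model with add-one smoothing whose per-word surprisal (negative log probability) serves as a reading-time predictor. This is a toy illustration under stated assumptions, not the authors' actual implementation; the corpus, function names, and smoothing choice are all hypothetical.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over tokenized sentences,
    padding each sentence with a start symbol."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def surprisal(unigrams, bigrams, prev, word, vocab_size):
    """Add-one smoothed bigram surprisal, -log2 P(word | prev).
    Higher surprisal is predicted to correlate with longer
    reading times at `word`."""
    p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
    return -math.log2(p)

# Hypothetical two-sentence corpus for illustration only.
corpus = [["the", "dog", "barks"], ["the", "cat", "sleeps"]]
uni, bi = train_bigram(corpus)
V = len(uni)  # vocabulary size including the start symbol
s = surprisal(uni, bi, "the", "dog", V)  # → 2.0 bits
```

In a regression setting like the one the abstract describes, surprisal values such as `s` would be entered as an additional predictor of per-word reading times alongside word-level baseline factors (e.g., word length and frequency).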