Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation

  • Authors:
  • Johan Bos;Edward Briscoe;Aoife Cahill;John Carroll;Stephen Clark;Ann Copestake;Dan Flickinger;Josef van Genabith;Julia Hockenmaier;Aravind Joshi;Ronald Kaplan;Tracy Holloway King;Sandra Kubler;Dekang Lin;Jan Tore Lonning;Christopher Manning;Yusuke Miyao;Joakim Nivre;Stephan Oepen;Kenji Sagae;Nianwen Xue;Yi Zhang

  • Affiliations:
  • University of Rome "La Sapienza" (Italy);University of Cambridge (UK);University of Stuttgart (Germany);University of Sussex (UK);Oxford University (UK);University of Cambridge (UK);Stanford University;Dublin City University (Ireland);University of Illinois at Urbana-Champaign;University of Pennsylvania;Powerset, Inc.;PARC;Indiana University;Google Inc.;University of Oslo (Norway);Stanford University;University of Tokyo (Japan);Vaxjo and Uppsala Universities (Sweden);University of Oslo (Norway) and CSLI Stanford;University of Southern California;University of Colorado;DFKI GmbH and Saarland University (Germany)

  • Venue:
  • CrossParser '08 Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation
  • Year:
  • 2008

Abstract

Broad-coverage parsing has come to a point where distinct approaches can offer (seemingly) comparable performance: statistical parsers acquired from the Penn Treebank (PTB); data-driven dependency parsers; "deep" parsers trained on enriched treebanks (in linguistic frameworks like CCG, HPSG, or LFG); and hybrid "deep" parsers, employing hand-built grammars in, for example, HPSG, LFG, or LTAG. Evaluation against trees in the Wall Street Journal (WSJ) section of the PTB has helped advance parsing research over the course of the past decade. Despite some skepticism, the crisp and, over time, stable task of maximizing ParsEval metrics (i.e., constituent labeling precision and recall) over PTB trees has served as the dominant benchmark. However, modern treebank parsers still restrict themselves to only a subset of PTB annotation; there is reason to worry about the idiosyncrasies of this particular corpus; it remains unknown how much the ParsEval metric (or any intrinsic evaluation) can inform NLP application developers; and PTB-style analyses leave a lot to be desired in terms of linguistic information.
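
For readers unfamiliar with the ParsEval metrics mentioned in the abstract, the following is a minimal sketch of labeled constituent precision and recall. It is not taken from the proceedings: the function name `parseval_scores`, the `(label, start, end)` bracket representation, and the toy trees are assumptions made for illustration, and the sketch omits the normalizations that standard scoring tools usually apply (for example, ignoring punctuation).

```python
from collections import Counter


def parseval_scores(gold_brackets, predicted_brackets):
    """Return (precision, recall, f1) over labeled constituent brackets.

    Each bracket is a hypothetical (label, start, end) tuple. Counters are
    used so that a bracket occurring twice in a tree is matched at most twice.
    """
    gold = Counter(gold_brackets)
    predicted = Counter(predicted_brackets)

    # Multiset intersection: brackets with identical label and span.
    matched = sum((gold & predicted).values())

    total_predicted = sum(predicted.values())
    total_gold = sum(gold.values())

    precision = matched / total_predicted if total_predicted else 0.0
    recall = matched / total_gold if total_gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy example (invented data): the parser mislabels one NP as PP.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
predicted = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]

precision, recall, f1 = parseval_scores(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case three of the four predicted brackets match the gold tree, so precision and recall are both 0.75. A score of this kind reflects only bracket overlap with PTB-style trees, which is the limitation the abstract raises with respect to intrinsic evaluation.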