SemEval-2010 Task 12: Parser Evaluation Using Textual Entailments
SemEval '10 Proceedings of the 5th International Workshop on Semantic Evaluation
Who Did What to Whom? A Contrastive Study of Syntacto-Semantic Dependencies
LAW VI '12 Proceedings of the Sixth Linguistic Annotation Workshop
Parser evaluation using textual entailments
Language Resources and Evaluation
Broad-coverage parsing has reached a point where distinct approaches offer (seemingly) comparable performance: statistical parsers acquired from the Penn Treebank (PTB); data-driven dependency parsers; "deep" parsers trained on enriched treebanks (in linguistic frameworks such as CCG, HPSG, or LFG); and hybrid "deep" parsers employing hand-built grammars in, for example, HPSG, LFG, or LTAG. Evaluation against trees in the Wall Street Journal (WSJ) section of the PTB has helped advance parsing research over the past decade. Despite some skepticism, the crisp and, over time, stable task of maximizing ParsEval metrics (i.e., constituent labeling precision and recall) over PTB trees has served as the dominant benchmark. However, modern treebank parsers still restrict themselves to a subset of the PTB annotation; there is reason to worry about the idiosyncrasies of this particular corpus; it remains unknown how much the ParsEval metric (or any intrinsic evaluation) can inform NLP application developers; and PTB-style analyses leave much to be desired in terms of linguistic information.
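The ParsEval metrics mentioned above amount to comparing labeled constituent brackets between a parser's output and the gold-standard tree. The following is a minimal sketch of that computation, assuming trees have already been reduced to sets of (label, start, end) spans; all names are illustrative, and real evaluation tools such as evalb add normalization steps (e.g., punctuation and root handling) omitted here.

# Minimal sketch of ParsEval-style labeled bracketing precision/recall.
# Trees are represented as sets of (label, start, end) constituent spans;
# extracting spans from trees is assumed to happen elsewhere.

def parseval_scores(gold_spans, test_spans):
    """Labeled bracketing precision, recall, and F1 for one sentence."""
    matched = len(gold_spans & test_spans)          # correctly labeled brackets
    precision = matched / len(test_spans) if test_spans else 0.0
    recall = matched / len(gold_spans) if gold_spans else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example over "the dog barked": the test parse gets the VP span wrong.
gold = {("NP", 0, 2), ("VP", 2, 3), ("S", 0, 3)}
test = {("NP", 0, 2), ("VP", 1, 3), ("S", 0, 3)}
p, r, f = parseval_scores(gold, test)
print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")            # P=0.67 R=0.67 F1=0.67

Here two of the three test brackets match the gold tree exactly in both label and span, giving precision and recall of 2/3; corpus-level scores are obtained by pooling matched and total bracket counts across sentences before dividing.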