Parser evaluation across frameworks without format conversion

  • Authors:
  • Wai Lok Tam; Yo Sato; Yusuke Miyao; Jun-ichi Tsujii

  • Affiliations:
  • University of Tokyo, Tokyo, Japan; Queen Mary University of London, London, U.K.; University of Tokyo, Tokyo, Japan; University of Tokyo, Tokyo, Japan

  • Venue:
  • CrossParser '08 (Coling 2008): Proceedings of the Workshop on Cross-Framework and Cross-Domain Parser Evaluation
  • Year:
  • 2008

Abstract

In the area of parser evaluation, formats like GR (grammatical relations) and SD (Stanford Dependencies), which are based on dependencies, the simplest representation of syntactic information, have been proposed as framework-independent metrics for parser evaluation. The assumption behind these proposals is that the simplicity of dependencies would make conversion from the syntactic structures and semantic representations used in other formalisms to GR/SD an easy job. However, Miyao et al. (2007) report that even conversion between these two formats is not easy at all. Moreover, an 80% conversion success rate is not meaningful for parsers that boast 90% accuracy. In this paper, we make an attempt at evaluation across frameworks without format conversion. This is achieved by generating a list of names of phenomena with each parse. These phenomenon names are matched against the phenomena given in the gold standard, and the number of matches found is used to evaluate the parser that produced the parses. This evaluation method is more effective than methods that involve format conversion because the phenomenon names are generated from the parser's output by a recognizer that recognizes the phenomenon illustrated by a sentence with a 100% success rate. This success rate is made possible by the reuse of native code: the code used to write the parser and the rules of the grammar loaded into it.
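
As a rough illustration of the matching step described in the abstract, the sketch below (in Python) shows how per-sentence phenomenon names might be compared against a gold standard and turned into a score. It is not the paper's implementation, which generates phenomenon names with a recognizer written in the parser's native code; the function name and data structures here are hypothetical.

  # Minimal sketch: match per-sentence phenomenon names against a gold standard.
  # 'predicted' and 'gold' map sentence IDs to sets of phenomenon names,
  # e.g. {"s1": {"relative_clause", "passive"}}.
  def phenomenon_match_score(predicted, gold):
      matched = 0
      total = 0
      for sent_id, gold_phenomena in gold.items():
          pred_phenomena = predicted.get(sent_id, set())
          matched += len(pred_phenomena & gold_phenomena)  # phenomena the parse accounts for
          total += len(gold_phenomena)                     # phenomena it should account for
      return matched / total if total else 0.0

  # Hypothetical usage:
  gold = {"s1": {"relative_clause", "passive"}, "s2": {"coordination"}}
  predicted = {"s1": {"relative_clause"}, "s2": {"coordination"}}
  print(phenomenon_match_score(predicted, gold))  # 2 of 3 gold phenomena matched, about 0.67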