Introduction to the special issue on the web as corpus. Computational Linguistics: Special Issue on Web as Corpus.
A compact architecture for dialogue management based on scripts and meta-outputs. In ANLC '00: Proceedings of the Sixth Conference on Applied Natural Language Processing.
TnT: a statistical part-of-speech tagger. In ANLC '00: Proceedings of the Sixth Conference on Applied Natural Language Processing.
Error mining for wide-coverage grammar engineering. In ACL '04: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics.
Bootstrapping deep lexical resources: resources for courses. In DeepLA '05: Proceedings of the ACL-SIGLEX Workshop on Deep Lexical Acquisition.
Using unknown word techniques to learn known words. In EMNLP '10: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing.
Acquisition of unknown word paradigms for large-scale grammars. In COLING '10: Proceedings of the 23rd International Conference on Computational Linguistics (Posters).
A machine learning approach to relational noun mining in German. In MWE '11: Proceedings of the Workshop on Multiword Expressions: From Parsing and Generation to the Real World.
In this paper, we illustrate the importance of making detailed linguistic information a central part of the automatic acquisition of large-scale lexicons, as a means of enhancing robustness while ensuring the maintainability and re-usability of deep lexicalised grammars. Using the error mining techniques proposed by van Noord (2004), we show that low lexical coverage is the main obstacle both to porting deep lexicalised grammars to domains other than those they were originally developed for and to the robustness of systems that use such grammars. To address this, we develop linguistically driven methods that exploit detailed morphosyntactic information to automatically improve the coverage of deep lexicalised grammars while preserving the high linguistic quality they typically already achieve.
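To make the error mining step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a log of (tokens, parsed-or-not) pairs from running a grammar over a corpus and computes the parsability of each word, i.e. the fraction of sentences containing the word that received a full parse; frequent words with low parsability are flagged as likely gaps in lexical coverage. Van Noord (2004) applies the same statistic to word n-grams as well; the function and threshold names here are illustrative.

```python
from collections import Counter

def mine_errors(parse_log, min_freq=5, max_parsability=0.5):
    """Rank words by parsability: the share of sentences containing
    the word that the grammar parsed (after van Noord, 2004).

    `parse_log` is an iterable of (tokens, parsed) pairs, where
    `tokens` is a list of words and `parsed` is a boolean."""
    seen = Counter()       # sentences containing the word
    parsed_in = Counter()  # ... of which the grammar parsed
    for tokens, parsed in parse_log:
        for word in set(tokens):  # count each word once per sentence
            seen[word] += 1
            if parsed:
                parsed_in[word] += 1
    suspects = [
        (word, parsed_in[word] / seen[word], seen[word])
        for word in seen
        if seen[word] >= min_freq
        and parsed_in[word] / seen[word] <= max_parsability
    ]
    # Lowest parsability first: these words most consistently block
    # a parse, pointing at missing or wrong lexical entries.
    return sorted(suspects, key=lambda t: t[1])

# Hypothetical usage with a toy parse log: the unknown word
# "gezellig" never occurs in a parsable sentence.
log = [
    (["the", "cat", "sleeps"], True),
    (["the", "gezellig", "cat"], False),
    (["a", "gezellig", "dog"], False),
]
for word, parsability, freq in mine_errors(log, min_freq=2,
                                           max_parsability=0.2):
    print(f"{word}\t{parsability:.2f}\t{freq}")
```

In practice the frequency cut-off matters: a word seen only once in an unparsable sentence is weak evidence, whereas a word that recurs across many failed parses is a strong candidate for a missing lexical entry.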