In this paper, we describe a method for structural noun phrase disambiguation which relies mainly on the examination of the text corpus under analysis and does not need to integrate any domain-dependent lexico-semantic or syntactico-semantic information. This method is implemented in the terminology extraction software LEXTER. We first explain why the integration of LEXTER in the LEXTER-K project, which aims at building a tool for knowledge extraction from large technical text corpora, requires improving the quality of the terminology extracted by LEXTER. Then we briefly describe the way LEXTER works and show what kind of disambiguation it has to perform when parsing "maximal-length" noun phrases. We introduce a method of disambiguation which relies on a very simple idea: whenever LEXTER has to choose among several competing noun sub-groups in order to disambiguate a maximal-length noun phrase, it checks whether each of these sub-groups occurs anywhere else in the corpus in a non-ambiguous situation, and then it makes a choice. The analysis of a half-a-million-word corpus resulted in an efficient disambiguation strategy. The average rates are: 27% no disambiguation, 70% correct disambiguation, 3% wrong disambiguation.
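The corpus-lookup idea described above can be illustrated with a minimal sketch. It is not LEXTER's implementation: the function names, the toy English phrases, and the restriction to three-word phrases are hypothetical. It makes two further assumptions: a sub-group counts as occurring in a "non-ambiguous situation" when it appears in the corpus as a complete two-word noun phrase, and the phrase is left undisambiguated when zero or both competing sub-groups are attested.

```python
from collections import Counter


def unambiguous_pairs(noun_phrases):
    """Count two-word phrases (assumption: these are the non-ambiguous situations)."""
    return Counter(tuple(np) for np in noun_phrases if len(np) == 2)


def disambiguate(phrase, attested):
    """Pick one of the two competing sub-groups of a three-word phrase.

    Returns the sub-group attested elsewhere in the corpus, or None when
    neither or both are attested (assumption: no decision in those cases).
    """
    w1, w2, w3 = phrase
    candidates = [(w1, w2), (w2, w3)]
    found = [c for c in candidates if attested[c] > 0]
    return found[0] if len(found) == 1 else None


# Hypothetical toy corpus of maximal-length noun phrases.
corpus = [
    ("pressure", "valve"),
    ("safety", "system"),
    ("high", "pressure", "valve"),
    ("safety", "system", "failure"),
    ("main", "feed", "pump"),
]

attested = unambiguous_pairs(corpus)
for np in corpus:
    if len(np) == 3:
        print(np, "->", disambiguate(np, attested))
```

In this toy corpus, "high pressure valve" and "safety system failure" are each resolved through a sub-group attested on its own, while "main feed pump" stays undisambiguated because neither of its sub-groups is attested, which corresponds to the "no disambiguation" case reported above.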