This paper describes the representation of Basque Multiword Lexical Units and the automatic processing of Multiword Expressions. After stating which kinds of multiword expressions are handled at the current stage of the work, we present the representation schema for the corresponding lexical units in a general-purpose lexical database. Thanks to its expressive power, the schema can handle not only fixed expressions but also morphosyntactically flexible constructions. It also allows us to lemmatize a word combination as a unit and yet parse its components individually when necessary. Moreover, we describe HABIL, a tool for the automatic processing of these expressions, and give some evaluation results. This work forms part of a general framework of processing tools for written Basque, which currently ranges from the tokenization and segmentation of single words to the syntactic tagging of general texts.