Multiword expressions pose a challenge to the development of large-scale, semantically rich Natural Language Processing (NLP) systems. We use a bilingual parallel corpus to automatically extract Light Verb Constructions (LVCs), a very common type of multiword expression in many languages, including Persian. Using two classifiers, we investigate the usefulness of seven linguistically informed features for automatically identifying Persian LVCs. To our knowledge, this is the first attempt at the automatic detection of a broad class of Persian LVCs. Results of our experiments show that the proposed features are reasonably successful at the task.