Automatic identification of Persian light verb constructions

Authors:
Bahar Salehi;Narjes Askarian;Afsaneh Fazly
Affiliations:
School of Electrical and Computer Engineering, Shiraz University, Iran;School of Electrical and Computer Engineering, Shiraz University, Iran;School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Iran
Venue:
CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Year:
2012

Abstract

Multiword expressions pose a challenge to the development of large-scale, semantically rich Natural Language Processing (NLP) systems. We use a bilingual parallel corpus to automatically extract Light Verb Constructions (LVCs), a very common type of multiword expression in many languages, including Persian. Using two classifiers, we investigate the usefulness of seven linguistically informed features for automatically identifying Persian LVCs. To our knowledge, this is the first attempt at the automatic detection of a broad class of Persian LVCs. Results of our experiments show that the proposed features are reasonably successful at the task.