Detecting complex predicates in Hindi using POS projection across parallel corpora
MWE '06 Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties
A complex predicate is a noun, verb, adjective, or adverb followed by a light verb, with the combination behaving as a single verbal unit. Complex predicates (CPs) are used abundantly in Hindi and other languages of the Indo-Aryan family. Detecting and interpreting CPs is an important and somewhat difficult task. Linguistic and statistical methods have had limited success in mining them. In this paper, we present a simple method for detecting CPs of all kinds using a Hindi-English parallel corpus. A CP is hypothesized when the conventional meaning of the light verb is absent from the aligned English sentence. This simple strategy exploits the fact that a CP is a multiword expression whose meaning is distinct from the meaning of its light verb. Although the methodology has several shortcomings, this empirical method mines CPs with a surprisingly high average precision of 89% and a recall of 90%.
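The core hypothesis can be illustrated with a minimal sketch. This is not the authors' implementation: the toy lexicon `LIGHT_VERB_GLOSSES`, the function name `hypothesize_cps`, the romanized and lemmatized Hindi tokens, and the choice of taking the immediately preceding token as the host word are all assumptions made for illustration. It only shows the idea that a light verb whose literal English meaning does not appear in the aligned sentence signals a candidate CP.

```python
# Hypothetical toy lexicon: romanized light-verb root -> English word forms
# that express its conventional (literal) meaning. Illustrative only.
LIGHT_VERB_GLOSSES = {
    "kar": {"do", "does", "did", "done", "doing"},
    "de": {"give", "gives", "gave", "given", "giving"},
    "le": {"take", "takes", "took", "taken", "taking"},
}


def hypothesize_cps(hindi_tokens, english_tokens, lexicon=LIGHT_VERB_GLOSSES):
    """Return (host, light_verb) pairs hypothesized to be complex predicates.

    Assumes hindi_tokens are already lemmatized to romanized roots and that
    the two token lists come from one aligned sentence pair.
    """
    english = {token.lower() for token in english_tokens}
    candidates = []
    # Start at 1 so every light verb has a preceding host word.
    for i in range(1, len(hindi_tokens)):
        verb = hindi_tokens[i]
        glosses = lexicon.get(verb)
        if glosses is None:
            continue  # not a known light verb
        # The literal meaning is absent from the English side, so the
        # light verb is probably part of a multiword predicate.
        if english.isdisjoint(glosses):
            candidates.append((hindi_tokens[i - 1], verb))
    return candidates


# "madad kar" (to help): English has no form of "do", so a CP is hypothesized.
help_pair = hypothesize_cps(["us", "merii", "madad", "kar"], ["He", "helped", "me"])

# "kitaab de" (give a book): "gave" is present, so the verb is read literally.
give_pair = hypothesize_cps(
    ["us", "mujh", "kitaab", "de"], ["He", "gave", "me", "a", "book"]
)
```

Under these assumptions, `help_pair` holds the single candidate `("madad", "kar")` and `give_pair` is empty. A realistic system would also need morphological analysis, sentence alignment, and a fuller gloss lexicon, none of which are specified in the abstract.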