Light verb constructions (LVCs) consist of a verbal and a nominal component, in which the noun preserves its original meaning while the verb has lost its meaning to some degree. They are syntactically flexible, and their meaning can be computed only partially from the meanings of their parts, so they require special treatment in natural language processing. The first step in such treatment is to identify them. In this study, we present FXTagger, our conditional random fields-based tool for identifying light verb constructions. The flexibility of the tool is demonstrated on two typologically different languages, English and Hungarian. Because earlier studies labeled different linguistic phenomena as light verb constructions, we first present a linguistics-based classification of light verb constructions and then show that FXTagger is able to identify the different classes in both languages. Different types of texts may contain different types of light verb constructions; moreover, the frequency of light verb constructions may differ from domain to domain. Hence we focus on the portability of models trained on different corpora, and we also investigate the effect of simple domain adaptation techniques in reducing the gap between domains. Our results show that, in spite of domain specificities, out-of-domain data can also contribute to successful LVC detection in all domains.