The StringNet lexico-grammatical knowledgebase and its applications
MWE '11 Proceedings of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World
We describe and motivate the design of a lexico-grammatical knowledgebase called StringNet and illustrate its significance for research into constructional phenomena in English. StringNet consists of a massive archive of what we call hybrid n-grams. Unlike traditional n-grams, hybrid n-grams can consist of any co-occurring combination of POS tags, lexemes, and specific word forms. Further, we detect and represent superordinate and subordinate relations among hybrid n-grams by cross-indexing them, which allows StringNet to be navigated through these hierarchies, from specific fixed expressions ("It's the thought that counts") up to their hosting proto-constructions (e.g., the It Cleft construction: "it's the [noun] that [verb]"). StringNet supports the discovery of grammatical dependencies (e.g., subject-verb agreement) in non-canonical configurations as well as lexical dependencies (e.g., adjective/noun collocations specific to families of constructions).
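To make the notion of a hybrid n-gram concrete, the following is a minimal sketch and not StringNet's actual implementation. It assumes each token carries a word form, a lexeme, and a POS tag, and lets every slot of an n-gram independently take any of the three. The names `Token`, `hybrid_ngrams`, and `subsumes`, and the coarse tag labels, are hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import NamedTuple


class Token(NamedTuple):
    form: str   # specific word form, e.g. "counts"
    lemma: str  # lexeme, e.g. "count"
    pos: str    # POS tag, e.g. "VERB" (tag labels here are an assumption)


# Abstraction levels, ordered from most specific to most general.
LEVELS = ("form", "lemma", "pos")


def hybrid_ngrams(tokens):
    """Return all hybrid n-grams for one token window.

    Each slot independently takes the token's form, lemma, or POS tag,
    so a window of n tokens yields 3**n hybrid n-grams.
    """
    slots = [
        (("form", t.form), ("lemma", t.lemma), ("pos", t.pos))
        for t in tokens
    ]
    return set(product(*slots))


def subsumes(general, specific):
    """True if `general` is at least as abstract as `specific` in every slot.

    Simplification: only meaningful for two hybrid n-grams generated
    from the same token window.
    """
    if len(general) != len(specific):
        return False
    return all(
        LEVELS.index(g_level) >= LEVELS.index(s_level)
        for (g_level, _), (s_level, _) in zip(general, specific)
    )


# Hypothetical tagging of a fragment of "It's the thought that counts".
window = [
    Token("the", "the", "DET"),
    Token("thought", "thought", "NOUN"),
    Token("that", "that", "PRON"),
    Token("counts", "count", "VERB"),
]
grams = hybrid_ngrams(window)

fixed = (("form", "the"), ("form", "thought"), ("form", "that"), ("form", "counts"))
proto = (("form", "the"), ("pos", "NOUN"), ("form", "that"), ("pos", "VERB"))
```

In this sketch, `proto` corresponds to the pattern "the [noun] that [verb]" and is superordinate to the fully lexical `fixed`, so `subsumes(proto, fixed)` holds while `subsumes(fixed, proto)` does not. A corpus-scale index would presumably need to store such relations between n-grams rather than recompute them per window.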