Some phrases can be interpreted either idiomatically (figuratively) or literally depending on context, and precise identification of idioms is indispensable for full-fledged natural language processing (NLP). To this end, we have constructed an idiom corpus for Japanese. This paper reports on the corpus and on the results of an idiom identification experiment that uses it. The corpus targets 146 ambiguous idioms and consists of 102,846 sentences, each annotated with a literal/idiom label. For idiom identification, we targeted 90 of the 146 idioms and adopted a word sense disambiguation (WSD) method using both common WSD features and idiom-specific features. As far as we know, the corpus and the experiment are the largest of their kind. We found that a standard supervised WSD method works well for idiom identification, achieving an accuracy of 89.25% with idiom-specific features and 88.86% without them, and that the most effective idiom-specific feature is the one involving the adjacency of idiom constituents.
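To make the adjacency idea concrete, the following is a minimal sketch of how a feature extractor might combine ordinary context-word features with an adjacency feature. It is not the paper's implementation: the function names (`adjacency_feature`, `extract_features`), the feature names, the `window` parameter, and the romanized example tokens are all hypothetical and chosen only for illustration. It also assumes that the token positions of the idiom constituents are already known.

```python
from typing import Dict, List, Sequence


def adjacency_feature(constituent_positions: Sequence[int]) -> bool:
    """Return True when the idiom constituents occupy consecutive token positions.

    Hypothetical helper: the source only says that a feature "involving the
    adjacency of idiom constituents" was the most effective one.
    """
    positions = sorted(constituent_positions)
    return all(b - a == 1 for a, b in zip(positions, positions[1:]))


def extract_features(
    tokens: List[str],
    constituent_positions: Sequence[int],
    window: int = 2,
) -> Dict[str, bool]:
    """Build a sparse feature dict for one candidate idiom occurrence.

    Assumed design: context words within `window` tokens of the idiom span
    stand in for the "common WSD features"; the adjacency flag stands in
    for the idiom-specific features.
    """
    positions = sorted(constituent_positions)
    start = max(0, positions[0] - window)
    end = min(len(tokens), positions[-1] + window + 1)

    features: Dict[str, bool] = {}
    for i in range(start, end):
        if i not in positions:
            # Context word (outside the constituents, but possibly between them).
            features[f"context={tokens[i]}"] = True

    features["constituents_adjacent"] = adjacency_feature(positions)
    return features


# Illustrative tokens only (romanized); positions mark the idiom constituents.
contiguous = extract_features(["kare", "wa", "hone", "o", "oru"], [2, 3, 4])
interrupted = extract_features(["hone", "o", "hidoku", "oru"], [0, 1, 3])
```

Feature dictionaries of this shape could then be passed to any standard supervised classifier. In the second example an intervening word separates the constituents, so the adjacency flag is false; a classifier could learn to associate that pattern with one reading or the other.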