Automatic detection of idiomatic clauses

  • Authors:
  • Anna Feldman;Jing Peng

  • Affiliations:
  • Department of Computer Science, Montclair State University, Montclair, NJ and Department of Linguistics, Montclair State University, Montclair, NJ;Department of Computer Science, Montclair State University, Montclair, NJ

  • Venue:
  • CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

We describe several experiments whose goal is to automatically identify idiomatic expressions in written text. We explore two approaches for the task: 1) idiom recognition as outlier detection; and 2) supervised classification of sentences. We apply principal component analysis for outlier detection. Detecting idioms as lexical outliers does not exploit class label information. So, in the following experiments, we use linear discriminant analysis to obtain a discriminant subspace and later use the three nearest neighbor classifier to obtain accuracy. We discuss pros and cons of each approach. All the approaches are more general than the previous algorithms for idiom detection --- neither do they rely on target idiom types, lexicons, or large manually annotated corpora, nor do they limit the search space by a particular type of linguistic construction.