Semantic clustering: an attempt to identify multiword expressions in Bengali

  • Authors:
  • Tanmoy Chakraborty;Dipankar Das;Sivaji Bandyopadhyay

  • Affiliations:
  • Jadavpur University, Kolkata, India;Jadavpur University, Kolkata, India;Jadavpur University, Kolkata, India

  • Venue:
  • MWE '11 Proceedings of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

One of the key issues in both natural language understanding and generation is the appropriate processing of Multiword Expressions (MWEs). MWE can be defined as a semantic issue of a phrase where the meaning of the phrase may not be obtained from its constituents in a straightforward manner. This paper presents an approach of identifying bigram noun-noun MWEs from a medium-size Bengali corpus by clustering the semantically related nouns and incorporating a vector space model for similarity measurement. Additional inclusion of the English WordNet::Similarity module also improves the results considerably. The present approach also contributes to locate clusters of the synonymous noun words present in a document. Experimental results draw a satisfactory conclusion after analyzing the Precision, Recall and F-score values.