Graph-based clustering for semantic classification of onomatopoetic words

  • Authors: Kenichi Ichioka; Fumiyo Fukumoto
  • Affiliations: University of Yamanashi, Japan (both authors)
  • Venue: TextGraphs-3: Proceedings of the 3rd TextGraphs Workshop on Graph-Based Algorithms for Natural Language Processing
  • Year: 2008

Abstract

This paper presents a method for the semantic classification of onomatopoetic words such as "[...] (hum)" and "[...] (clip clop)". Such words exist in every language, and Japanese is especially rich in them. We used a graph-based clustering algorithm called Newman clustering. The algorithm computes a simple quality function to test whether a particular division of the graph is meaningful; this function is based on the weights of the edges between nodes. To calculate the weights, we combined two similarity measures: distributional similarity and orthographic similarity. Results obtained on Web data showed a 9.0% improvement over a baseline that uses distributional similarity alone.
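
The abstract does not give the quality function or say how the two similarity measures are combined, so the following is only a minimal sketch under stated assumptions. It assumes the quality function is the weighted modularity commonly associated with Newman's method, Q = sum over communities c of (e_cc - a_c^2), and that the two similarities are mixed linearly. The names `combined_weight`, `modularity`, `greedy_newman`, and the parameter `alpha` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def combined_weight(dist_sim, orth_sim, alpha=0.5):
    """Assumed linear mix of distributional and orthographic similarity.

    The abstract does not state the combination rule; alpha is hypothetical.
    """
    return alpha * dist_sim + (1.0 - alpha) * orth_sim


def modularity(weights, communities):
    """Weighted modularity Q = sum_c (e_cc - a_c^2).

    weights: dict mapping frozenset({u, v}) -> weight (undirected, no self-loops)
    communities: list of node sets that together partition the graph
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0
    label = {node: i for i, comm in enumerate(communities) for node in comm}
    internal = [0.0] * len(communities)  # edge weight inside each community
    degree = [0.0] * len(communities)    # summed weighted degree per community
    for edge, w in weights.items():
        u, v = tuple(edge)
        cu, cv = label[u], label[v]
        degree[cu] += w
        degree[cv] += w
        if cu == cv:
            internal[cu] += w
    return sum(
        internal[c] / total - (degree[c] / (2.0 * total)) ** 2
        for c in range(len(communities))
    )


def greedy_newman(weights):
    """Naive agglomerative clustering: start from singletons and repeatedly
    apply the merge that raises Q the most; stop when no merge raises Q."""
    nodes = set().union(*weights) if weights else set()
    communities = [{n} for n in sorted(nodes)]
    best_q = modularity(weights, communities)
    while len(communities) > 1:
        best_merge = None
        for i, j in combinations(range(len(communities)), 2):
            candidate = [c for k, c in enumerate(communities) if k not in (i, j)]
            candidate.append(communities[i] | communities[j])
            q = modularity(weights, candidate)
            if q > best_q:
                best_q, best_merge = q, candidate
        if best_merge is None:
            break
        communities = best_merge
    return communities, best_q
```

In this sketch each node would be an onomatopoetic word and each edge weight the output of `combined_weight` for a word pair. Recomputing Q for every candidate merge is quadratic per step and is meant only to make the idea concrete, not to match the efficiency of a real implementation.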