Semantic categorization of contextual features based on wordnet for g-to-p conversion of arabic numerals combined with homographic classifiers

  • Authors:
  • Youngim Jung;Aesun Yoon;Hyuk-Chul Kwon

  • Affiliations:
  • Department of Computer Science and Engineering, Pusan National University, Busan, S. Korea;Department of French, Pusan National University, Busan, S. Korea;Department of Computer Science and Engineering, Pusan National University, Busan, S. Korea

  • Venue:
  • AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
  • Year:
  • 2005

Abstract

Arabic numerals occur frequently and carry significant meaning, especially in scientific or informative texts. Converting Arabic numerals to phonemes in Korean is difficult when they are combined with ambiguous (homographic) classifiers. In this paper, the ambiguities of Arabic numerals combined with homographic classifiers are analyzed, and methods for their sense disambiguation based on KorLex (Korean Lexico-Semantic Network) are proposed. Words preceding or following the Arabic numerals are categorized into 54 semantic classes based on the lexical hierarchy in KorLex 1.0. A decision tree is trained on these semantic classes to classify the meaning and the reading of the Arabic numerals. The proposed model achieves 87.3% accuracy, which is 14.1% higher than the baseline.
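
The general idea in the abstract (semantic classes of neighbouring words feeding a decision tree that picks a numeral reading) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the class names (PERSON, TIME, and so on), the two features `prev` and `next`, the labels `native` and `sino`, and the training samples are all invented here, and a plain ID3-style tree stands in for whatever decision-tree learner the paper used. The motivating case is a homographic classifier such as 분, which takes a Sino-Korean numeral reading when it means "minutes" and a native Korean reading when it is the honorific counter for people.

```python
import math
from collections import Counter

# Toy samples, invented for illustration (not data from the paper).
# Features are hypothetical semantic classes of the words before and after
# a numeral + classifier pair; the label is the numeral reading to choose.
SAMPLES = [
    ({"prev": "PERSON", "next": "ACT"}, "native"),
    ({"prev": "PERSON", "next": "STATE"}, "native"),
    ({"prev": "TIME", "next": "ACT"}, "sino"),
    ({"prev": "TIME", "next": "STATE"}, "sino"),
    ({"prev": "EVENT", "next": "DURATION"}, "sino"),
    ({"prev": "NONE", "next": "DURATION"}, "sino"),
    ({"prev": "NONE", "next": "ACT"}, "native"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_feature(samples, features):
    """Return the feature with the highest information gain."""
    base = entropy([label for _, label in samples])

    def gain(feature):
        remainder = 0.0
        for value in {x[feature] for x, _ in samples}:
            subset = [label for x, label in samples if x[feature] == value]
            remainder += len(subset) / len(samples) * entropy(subset)
        return base - remainder

    return max(features, key=gain)


def build_tree(samples, features):
    """Build an ID3-style tree; leaves are labels, inner nodes are dicts."""
    labels = [label for _, label in samples]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    feature = best_feature(samples, features)
    remaining = [f for f in features if f != feature]
    branches = {}
    for value in {x[feature] for x, _ in samples}:
        subset = [(x, label) for x, label in samples if x[feature] == value]
        branches[value] = build_tree(subset, remaining)
    # "default" is the fallback for semantic classes unseen during training.
    return {"feature": feature, "branches": branches, "default": majority}


def classify(tree, x):
    """Walk the tree until a leaf (a plain label string) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(x[tree["feature"]], tree["default"])
    return tree


tree = build_tree(SAMPLES, ["prev", "next"])
print(classify(tree, {"prev": "PERSON", "next": "ACT"}))
```

In a setting like the one the abstract describes, the feature values would come from mapping context words onto the 54 KorLex-based classes rather than from a hand-written list.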