An Investigation of an Interontologia: Comparison of the Thousand-Character Text and Roget's Thesaurus

  • Authors:
  • Sang-Rak Kim; Jae-Gun Yang; Jae-Hak J. Bae

  • Affiliations:
  • Institute of e-Vehicle Technology, University of Ulsan / ITSTAR Co., Ltd., Ulsan, South Korea; School of Computer Engineering & Information Technology, University of Ulsan, Ulsan, South Korea; School of Computer Engineering & Information Technology, University of Ulsan, Ulsan, South Korea

  • Venue:
  • ICCPOL '09 Proceedings of the 22nd International Conference on Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy
  • Year:
  • 2009

Abstract

This study presents a lexical category analysis of the Thousand-Character Text and Roget's Thesaurus. Through preprocessing, both resources were built into databases. To make the analysis easier and the research more efficient, we also developed a system that searches Roget's Thesaurus for the categories corresponding to the Chinese characters in the Thousand-Character Text. According to the results, most of the 39 sections of Roget's Thesaurus, the 'Creative Thought' section excepted, were relevant to Chinese characters in the Thousand-Character Text. Three sections, 'Space in General', 'Dimensions' and 'Matter in General', have higher mapping rates. The correlation coefficient is about 0.94, indicating high category relevance at the section level between the Thousand-Character Text and Roget's Thesaurus.
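
The section-level analysis described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the mapping rate is defined or which two series the correlation coefficient compares, so the following assumes a mapping rate equal to the share of characters that map to a section, and a plain Pearson coefficient over two per-section series. The data, the section assignments, and the names `CHAR_TO_SECTIONS`, `section_counts`, `mapping_rates` and `pearson` are all hypothetical and are not taken from the authors' system.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: each character maps to the Roget sections it was
# matched to. These assignments are illustrative only, not the paper's results.
CHAR_TO_SECTIONS = {
    "天": ["Space in General"],
    "地": ["Space in General", "Matter in General"],
    "玄": ["Dimensions"],
    "黃": ["Matter in General"],
}


def section_counts(char_to_sections):
    """Count how many characters map to each section (once per character)."""
    counts = Counter()
    for sections in char_to_sections.values():
        counts.update(set(sections))
    return counts


def mapping_rates(counts, total_chars):
    """Assumed definition: share of all characters that map to each section."""
    return {section: n / total_chars for section, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


counts = section_counts(CHAR_TO_SECTIONS)
rates = mapping_rates(counts, len(CHAR_TO_SECTIONS))
```

In an actual comparison, the two series passed to `pearson` would be per-section figures for the Thousand-Character Text and for Roget's Thesaurus, listed in the same section order.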