Multilingual conceptual access to lexicon based on shared orthography: an ontology-driven study of Chinese and Japanese

Authors:
Chu-Ren Huang;Chiyo Hotani;Wan-Ying Lin;Ya-Min Chou;Sheng-Yi Chen
Affiliations:
Academia Sinica, Nankang, Taipei, Taiwan;University of Tuebingen, Tübingen, Germany;Academia Sinica, Nankang, Taipei, Taiwan;Ming Chuan University, Taipei, Taiwan;Academia Sinica, Nankang, Taipei, Taiwan
Venue:
COGALEX '08 Proceedings of the workshop on Cognitive Aspects of the Lexicon
Year:
2008

Citing 1
Cited 1

Hanzi grid: toward a knowledge infrastructure for Chinese character-based cultures

IWIC'07 Proceedings of the 1st international conference on Intercultural collaboration

Chinese-Japanese Machine Translation Exploiting Chinese Characters

ACM Transactions on Asian Language Information Processing (TALIP)

Abstract

In this paper we propose a model for conceptual access to a multilingual lexicon based on shared orthography. Our proposal relies crucially on two facts: that both Chinese and Japanese conventionally use Chinese orthography in their respective writing systems, and that Chinese orthography is anchored on a system of radical parts which encode basic concepts. Each orthographic unit, called hanzi in Chinese and kanji in Japanese, contains a radical which indicates the broad semantic class of the meaning of that unit. Our study utilizes the homomorphism between the Chinese hanzi and Japanese kanji systems to identify bilingual word correspondences. We use bilingual dictionaries, including WordNet, to verify the semantic relation between the crosslingual pairs. These bilingual pairs are then mapped to an ontology constructed from the relation between the meaning of each character and the basic concept of its radical part. The conceptual structure of the radical ontology is proposed as a model for simultaneous conceptual access to both languages. A study based on words containing characters composed of the "口 (mouth)" radical is given to illustrate the proposal and the actual model. The fact that this model works for two typologically very different languages, and that it contains generative-lexicon-like coercive links, suggests that it has the conceptual robustness to be applied to other languages.
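
The first two steps described in the abstract, pairing words by shared written form and indexing them by radical, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the word lists, the hand-written `RADICAL_OF` table, and the helper names `shared_forms` and `group_by_radical` are hypothetical and are not the paper's data, ontology, or implementation. Exact string matching stands in for the hanzi/kanji correspondence, and it ignores variant character forms.

```python
# Toy data only: these word lists and the radical table are illustrative
# assumptions, not the dictionaries or ontology used in the paper.
CHINESE_WORDS = {"呼吸", "味", "吃", "河"}
JAPANESE_WORDS = {"呼吸", "味", "喫茶", "河"}

# Hypothetical character-to-radical lookup, written by hand for this sketch.
RADICAL_OF = {
    "呼": "口", "吸": "口", "味": "口", "吃": "口", "喫": "口",
    "河": "水", "茶": "艸",
}


def shared_forms(zh_words, ja_words):
    """Return the word forms that are written identically in both lists."""
    return sorted(set(zh_words) & set(ja_words))


def group_by_radical(words, radical_of):
    """Index each word under the radical of every character it contains."""
    groups = {}
    for word in words:
        radicals = {radical_of[ch] for ch in word if ch in radical_of}
        for radical in radicals:
            groups.setdefault(radical, []).append(word)
    return groups


# Candidate pairs: same written form in both languages.
candidates = shared_forms(CHINESE_WORDS, JAPANESE_WORDS)

# Candidates indexed by radical, e.g. all candidates under the mouth radical.
by_radical = group_by_radical(candidates, RADICAL_OF)
mouth_candidates = by_radical.get("口", [])
```

An identical written form only yields candidate pairs. Whether the two words actually share a meaning is a separate check, which the abstract assigns to bilingual dictionaries.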