Modelling the orthographic neighbourhood for Japanese kanji

Authors:
Lars Yencken;Timothy Baldwin
Affiliations:
Computer Science and Software Engineering, University of Melbourne, Victoria, Australia;Computer Science and Software Engineering, University of Melbourne, Victoria, Australia
Venue:
ICCPOL'06 Proceedings of the 21st international conference on Computer Processing of Oriental Languages: beyond the orient: the research challenges ahead
Year:
2006

Citing 2

The kappa statistic: a second look
Computational Linguistics

Building a graphetic dictionary for Japanese kanji: character look up based on brush strokes or stroke groups, and the display of kanji as path data
ElectricDict '04 Proceedings of the Workshop on Enhancing and Using Electronic Dictionaries

Cited 3

Kansuke: A logograph look-up interface based on a few modified stroke prototypes
ACM Transactions on Computer-Human Interaction (TOCHI)

Orthographic similarity search for dictionary lookup of Japanese words
Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence

Measuring and predicting orthographic associations: modelling the similarity of Japanese kanji
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1

Abstract

Japanese kanji recognition experiments are typically narrowly focused and feature only native speakers as participants. It remains unclear how to apply their results to kanji similarity applications, especially since learners are much more likely to make similarity-based confusion errors. We describe an experiment to collect authentic human similarity judgements from participants of all levels of Japanese proficiency, from non-speaker to native. The data was used to construct simple similarity models for kanji based on pixel difference and radical cosine similarity, as a step towards genuine confusability data. The latter model, radical cosine similarity, proved the best predictor of human responses.
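
The abstract names two similarity models without defining them. The following is a minimal sketch of what such measures could look like, not the authors' implementation. It assumes that each kanji is represented as a bag of radicals for the cosine measure, and as an equal-sized grayscale bitmap with values in [0, 1] for the pixel measure. The function names `radical_cosine_similarity` and `pixel_similarity` are hypothetical, as is the choice to normalise the pixel difference by the number of pixels.

```python
from collections import Counter
from math import sqrt


def radical_cosine_similarity(radicals_a, radicals_b):
    """Cosine similarity between two bags of radicals (assumed representation)."""
    a, b = Counter(radicals_a), Counter(radicals_b)
    # Dot product over the radicals that the two kanji share.
    dot = sum(a[r] * b[r] for r in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    # Two empty bags have no defined angle; treat them as dissimilar.
    return dot / norm if norm else 0.0


def pixel_similarity(image_a, image_b):
    """One minus the mean absolute pixel difference of two equal-sized bitmaps.

    Each image is a list of rows of floats in [0, 1] (assumed rendering).
    """
    if not image_a or len(image_a) != len(image_b):
        raise ValueError("images must be non-empty and have the same height")
    if any(len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)):
        raise ValueError("images must have the same width")
    total = sum(
        abs(pa - pb)
        for row_a, row_b in zip(image_a, image_b)
        for pa, pb in zip(row_a, row_b)
    )
    count = sum(len(row) for row in image_a)
    if count == 0:
        raise ValueError("images must contain at least one pixel")
    return 1.0 - total / count


if __name__ == "__main__":
    # Illustrative decompositions only: 明 as 日 + 月, 朋 as 月 + 月.
    print(radical_cosine_similarity(["日", "月"], ["月", "月"]))
    # Toy 2x2 bitmaps standing in for rendered glyphs.
    print(pixel_similarity([[1, 0], [0, 1]], [[1, 0], [0, 0]]))
```

In this sketch both measures return 1 for identical inputs and fall towards 0 as the kanji share fewer radicals or fewer pixels, so either score could be compared against human similarity judgements of the kind the experiment collects.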