Visually and phonologically similar characters are major contributors to errors in Chinese text. By defining appropriate similarity measures over extended Cangjie codes, we can identify visually similar characters within a fraction of a second. Using the pronunciation information recorded for individual characters in Chinese lexicons, we can compute a list of characters that are phonologically similar to a given character. We collected 621 incorrect Chinese words reported on the Internet and analyzed the causes of these errors. Of these errors, 83% were related to phonological similarity and 48% to visual similarity between the characters involved. The lists of phonologically and visually similar characters generated by our programs covered more than 90% of the incorrect characters in the reported errors.
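To make the idea of the two similarity lists concrete, here is a minimal Python sketch. It is not the paper's method. The lexicon is a small hand-entered toy table that uses plain Cangjie-style codes rather than the extended codes the abstract mentions. The scoring functions (a Dice-style score over the longest common subsequence of codes, and a three-level pinyin match) and all names such as `TOY_LEXICON` and `similar_characters` are assumptions made for illustration only.

```python
# Hypothetical toy lexicon: character -> (Cangjie-style code, pinyin with tone digit).
# The entries are illustrative and are NOT the extended Cangjie codes from the paper.
TOY_LEXICON = {
    "明": ("AB", "ming2"),
    "朋": ("BB", "peng2"),
    "林": ("DD", "lin2"),
    "森": ("DDD", "sen1"),
    "名": ("NIR", "ming2"),
    "命": ("OMRL", "ming4"),
}


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two code strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def visual_similarity(c1: str, c2: str) -> float:
    """Assumed measure: Dice-style overlap of the two characters' codes."""
    code1, code2 = TOY_LEXICON[c1][0], TOY_LEXICON[c2][0]
    return 2 * lcs_length(code1, code2) / (len(code1) + len(code2))


def phonological_similarity(c1: str, c2: str) -> float:
    """Assumed measure: 1.0 same syllable and tone, 0.5 same syllable only."""
    p1, p2 = TOY_LEXICON[c1][1], TOY_LEXICON[c2][1]
    if p1 == p2:
        return 1.0
    if p1[:-1] == p2[:-1]:  # strip the trailing tone digit
        return 0.5
    return 0.0


def similar_characters(ch: str, measure, threshold: float) -> list[str]:
    """Other characters scoring at least `threshold`, best match first."""
    scored = [(measure(ch, other), other) for other in TOY_LEXICON if other != ch]
    kept = [(score, other) for score, other in scored if score >= threshold]
    return [other for score, other in sorted(kept, key=lambda t: (-t[0], t[1]))]


if __name__ == "__main__":
    print(similar_characters("明", visual_similarity, 0.5))
    print(similar_characters("明", phonological_similarity, 0.5))
```

With this toy table, a misspelled word could be checked by testing whether the incorrect character appears in either list for the intended character, which is the kind of coverage the abstract reports. A real system would need a full lexicon and tuned measures.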