Detecting word misuse in Chinese

Authors: Wei Liu
Affiliations: University of Sheffield
Venue: WSA '10 Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics in a World of Social Media
Year: 2010


Abstract

Social network services (SNS) and personal blogs have become the most popular platforms for online communication and information sharing. However, because most modern computer keyboards are Latin-based, speakers of Asian languages such as Chinese have to rely on an input system that accepts a Romanisation of the characters and converts it into characters or words in that language. In Chinese, this form of Romanisation (usually called Pinyin) is highly ambiguous. Word misuse often occurs because the user chooses a wrong candidate, or deliberately substitutes the word with another character string that has the identical Romanisation in order to convey certain semantics or to achieve a sarcastic effect. In this paper we aim to develop a system that can automatically identify such word misuse and suggest the correct word to be used.