Short Messaging Service (SMS) texts differ markedly from conventional written text and exhibit phenomena of their own. To translate SMS texts, traditional approaches model such irregularities directly in Machine Translation (MT). However, these approaches suffer from a customization problem: tremendous effort is required to adapt the language model of an existing translation system to the SMS text style. We offer an alternative approach that resolves such irregularities by normalizing SMS texts before MT. In this paper, we view SMS normalization as a translation problem from the SMS language to the English language, and we propose to adapt a phrase-based statistical MT model for the task. Evaluation by 5-fold cross validation on a parallel SMS normalized corpus of 5000 sentences shows that our method achieves a BLEU score of 0.80702 against a baseline BLEU score of 0.6958. A second experiment, translating SMS texts from English to Chinese on a separate SMS text corpus, shows that using SMS normalization as MT preprocessing substantially boosts SMS translation performance, from 0.1926 to 0.3770 in BLEU score.
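To make the idea of normalization as translation concrete, the following is a minimal sketch and not the paper's actual model. It assumes a tiny hand-written phrase table (SMS phrase to English phrase, with invented probabilities), a bigram language model with add-one smoothing trained on a four-sentence toy corpus, and monotone decoding with no reordering. All names (`PHRASE_TABLE`, `CORPUS`, `train_bigram`, `normalize`) are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy phrase table: SMS phrase -> {English phrase: P(sms | english)}.
# The probabilities are invented for illustration, not learned from data.
PHRASE_TABLE = {
    ("u",): {"you": 0.9, "u": 0.1},
    ("r",): {"are": 0.8, "r": 0.2},
    ("2",): {"to": 0.5, "too": 0.3, "2": 0.2},
    ("b4",): {"before": 1.0},
    ("c", "u"): {"see you": 1.0},
    ("l8r",): {"later": 1.0},
}

# Toy English corpus standing in for the target-side language model data.
CORPUS = [
    "see you later",
    "are you going to the party",
    "you are late",
    "i want to go before you",
]


def train_bigram(corpus):
    """Return a function giving add-one smoothed log P(cur | prev)."""
    history, bigrams, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split()
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            history[prev] += 1
            bigrams[(prev, cur)] += 1
    v = len(vocab)

    def logprob(prev, cur):
        return math.log((bigrams[(prev, cur)] + 1) / (history[prev] + v))

    return logprob


def normalize(sms, phrase_table, lm_logprob, max_len=2):
    """Monotone phrase-based decoding: maximize log P(sms|eng) + log P(eng)."""
    words = sms.lower().split()
    # best[i] maps the last English word to (score, output tokens) for words[:i].
    best = [dict() for _ in range(len(words) + 1)]
    best[0]["<s>"] = (0.0, [])
    for i in range(len(words)):
        for last, (score, out) in best[i].items():
            for n in range(1, max_len + 1):
                if i + n > len(words):
                    break
                src = tuple(words[i:i + n])
                options = phrase_table.get(src)
                if options is None:
                    if n > 1:
                        continue
                    # Unknown single words pass through unchanged.
                    options = {src[0]: 1.0}
                for tgt, p in options.items():
                    s, prev = score + math.log(p), last
                    for token in tgt.split():
                        s += lm_logprob(prev, token)
                        prev = token
                    if prev not in best[i + n] or s > best[i + n][prev][0]:
                        best[i + n][prev] = (s, out + tgt.split())
    return " ".join(max(best[-1].values(), key=lambda c: c[0])[1])


LM = train_bigram(CORPUS)
print(normalize("c u l8r", PHRASE_TABLE, LM))
print(normalize("r u going 2 the party", PHRASE_TABLE, LM))
```

In a real system the phrase table would be estimated from the parallel SMS and normalized corpus rather than written by hand, and the normalized output would then be passed to the downstream MT system as preprocessed input.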