Electronic written texts used in computer-mediated interactions (e-mails, blogs, chats, etc.) present major deviations from the norm of the language. This paper presents a comparative study of systems that aim to normalize the orthography of French SMS messages. After discussing the linguistic peculiarities of these messages and possible approaches to their automatic normalization, we present, evaluate and contrast two systems: one draws inspiration from the Machine Translation task; the other uses techniques commonly found in automatic speech recognition devices. Combining both approaches, our best normalization system achieves about 11% Word Error Rate on a test set of about 3000 unseen messages.
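For readers unfamiliar with the metric, Word Error Rate is conventionally the word-level edit distance (substitutions, deletions and insertions) between a system output and a reference, divided by the number of reference words. The sketch below illustrates that standard definition only; the function name `word_error_rate`, the whitespace tokenization and the sample sentences are assumptions made for illustration, not the evaluation setup used in the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch: tokens are obtained by whitespace splitting,
    which is an assumption, not the paper's tokenization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical normalized output with one wrong word out of five.
reference = "je suis à la maison"
hypothesis = "je suis a la maison"
print(word_error_rate(reference, hypothesis))
```

Under this definition, a score of about 11% corresponds to roughly one word-level edit for every nine reference words.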