The language used in electronic communications such as emails, chats and SMS texts exhibits distinctive phenomena and deviates markedly from standard natural language. Typical machine translation approaches are difficult to adapt to SMS language because of its many irregularities. This paper presents a new approach to SMS normalization that combines lexical and phonological translation techniques with disambiguation algorithms at two different levels: lexical and semantic. The system outperforms some existing SMS normalization methods, even though the corpus that was created has some features that complicate the normalization task.
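To make the overall idea concrete, the following is a minimal sketch of such a pipeline, not the system described in the paper. Everything in it is an assumption made for illustration: the tiny abbreviation dictionary `LEXICON`, the word list `VOCAB`, the crude `phonetic_key` function, and the bigram counts in `BIGRAMS`, which stand in for the paper's lexical and semantic disambiguation algorithms.

```python
import re
from collections import Counter

# Hypothetical abbreviation dictionary (stand-in for lexical translation).
LEXICON = {"u": ["you"], "c": ["see"], "gr8": ["great"], "2": ["to", "too", "two"]}

# Hypothetical vocabulary of standard words used for phonetic matching.
VOCAB = ["tomorrow", "see", "you", "great", "to", "too", "two", "tonight", "later"]

# Hypothetical bigram counts (stand-in for a real disambiguation model).
BIGRAMS = Counter({("see", "you"): 5, ("you", "too"): 4, ("you", "tomorrow"): 3})


def phonetic_key(word):
    """Crude phonetic key: keep the first letter, drop later vowels,
    then collapse runs of the same letter."""
    word = word.lower()
    key = word[0] + re.sub(r"[aeiouy]", "", word[1:])
    return re.sub(r"(.)\1+", r"\1", key)


def candidates(token):
    """Return candidate standard words for one SMS token."""
    token = token.lower()
    if token in LEXICON:          # lexical translation
        return LEXICON[token]
    if token in VOCAB:            # already a standard word
        return [token]
    key = phonetic_key(token)     # phonological translation
    matches = [w for w in VOCAB if phonetic_key(w) == key]
    return matches or [token]     # leave unknown tokens unchanged


def normalize(message):
    """Greedy left-to-right normalization: for each token, pick the
    candidate with the highest bigram count given the previous word."""
    output, prev = [], None
    for token in message.split():
        best = max(candidates(token), key=lambda c: BIGRAMS[(prev, c)])
        output.append(best)
        prev = best
    return " ".join(output)


print(normalize("c u tmrw"))  # see you tomorrow
print(normalize("c u 2"))     # see you too
```

In this sketch the ambiguous token `2` resolves to "too" only because of the invented bigram count for ("you", "too"). A real system would replace the greedy bigram choice with proper lexical and semantic disambiguation.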