A machine-translation method for normalization of SMS

Authors:
Darnes Vilariño; David Pinto; Beatriz Beltrán; Saul León; Esteban Castillo; Mireya Tovar
Affiliations:
Faculty of Computer Science, Benemérita Universidad Autónoma de Puebla, Mexico (all authors)
Venue:
MCPR'12 Proceedings of the 4th Mexican conference on Pattern Recognition
Year:
2012

Abstract

Normalization of SMS text is an important task for the computational community because of the rapid growth of services on mobile devices that rely on this kind of message. The particular writing style used in SMS places many limitations on the automatic processing of such texts. Although these texts pose enough problems on their own, we are also interested in tasks that require understanding the meaning of documents in different languages, which further increases the complexity. Our approach normalizes SMS texts using machine translation techniques. For this purpose, we use a statistical bilingual dictionary, computed on the basis of the IBM-4 model, to determine the best translation for a given SMS term. We compared the presented approach with a traditional probabilistic information retrieval method and observed that the proposed normalization model greatly improves the performance of the probabilistic one.
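
The dictionary lookup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a translation table has already been estimated (for example by a word-alignment tool) and only shows how to pick the most probable normalized form for each SMS term. The table contents, the probabilities, the whitespace tokenization, and the names `TOY_TABLE`, `best_translation`, and `normalize_sms` are all hypothetical and chosen for illustration.

```python
from typing import Dict

# Hypothetical translation table: SMS term -> {candidate: P(candidate | term)}.
# The entries and probabilities are invented for illustration only; a real
# table would be estimated from a parallel corpus of SMS and normalized text.
TranslationTable = Dict[str, Dict[str, float]]

TOY_TABLE: TranslationTable = {
    "u": {"you": 0.9, "your": 0.1},
    "r": {"are": 0.8, "our": 0.2},
    "2moro": {"tomorrow": 1.0},
    "gr8": {"great": 1.0},
}


def best_translation(term: str, table: TranslationTable) -> str:
    """Return the most probable candidate for term, or term itself if unknown."""
    candidates = table.get(term.lower())
    if not candidates:
        # Out-of-dictionary terms are passed through unchanged (an assumption).
        return term
    return max(candidates, key=candidates.get)


def normalize_sms(message: str, table: TranslationTable) -> str:
    """Normalize a message token by token, using simple whitespace tokenization."""
    return " ".join(best_translation(tok, table) for tok in message.split())


if __name__ == "__main__":
    print(normalize_sms("u r gr8", TOY_TABLE))
```

This sketch translates each term independently and ignores context. A full statistical translation setup would typically also account for word order and neighbouring words.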