Noise is ubiquitous in real-world text communication. Text produced by processing signals intended for human use is often noisy for automated computer processing. Automatic speech recognition, optical character recognition, and machine translation all introduce processing noise. Digital text produced in informal settings, such as online chat, SMS, email, message boards, newsgroups, blogs, wikis, and web pages, also contains considerable noise. In this paper, we present a survey of the existing measures for noise in text. We also cover application areas that ingest this noisy text for tasks such as information retrieval and information extraction.