Noisy unstructured text is common in informal settings such as online chat, SMS, email, newsgroups, and blogs, as well as in text automatically transcribed from speech or automatically recognized from printed or handwritten material. This paper focuses on the issues that automatic text classifiers face when analyzing noisy documents from these sources. Its goal is to identify and study the effect of noise on automatic text classification. We present detailed experimental results with simulated noise on the Tech-TC300 and 20-newsgroups benchmark datasets.