This work presents a system for the categorization of noisy texts. By noisy we mean any text obtained through an error-prone extraction process from media other than digital text. We show that, even with an average Word Error Rate of around 50%, the loss in categorization performance relative to the clean version of the same documents is negligible.
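To make the 50% figure concrete: Word Error Rate is commonly defined as the word-level edit distance (substitutions, deletions and insertions) between an extracted text and its clean reference, divided by the number of words in the reference. The sketch below follows that common definition. It assumes whitespace tokenization, and the function name `word_error_rate` is hypothetical; the abstract does not say how the rate was actually measured.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are separated by whitespace. The source does not
    describe its tokenization or scoring procedure.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# One substituted word and one dropped word out of four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Under this definition, a rate of 50% means that turning the extracted text back into the clean one takes about one word edit for every two reference words.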