The effect of noise in automatic text classification

  • Authors:
  • R. M. Samant; S. Rao

  • Affiliations:
  • MPSTME, SVKM's NMIMS, Mumbai, India; TIMSCDR, Thakur Inst. of Mgmt Studies

  • Venue:
  • Proceedings of the International Conference & Workshop on Emerging Trends in Technology
  • Year:
  • 2011

Abstract

Noisy unstructured text is common in informal settings such as online chat, SMS, email, newsgroups, and blogs, as well as in automatically transcribed speech and in text automatically recognized from printed or handwritten material. This paper focuses on the issues that automatic text classifiers face when analyzing noisy documents from these varied sources. Its goal is to study the effect of noise on automatic text classification. We present detailed experimental results with simulated noise on the Tech-TC300 and 20-newsgroups benchmark datasets.
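
The abstract does not say how the noise was simulated. As a purely illustrative sketch, assuming character-level corruption (random deletions, substitutions, and insertions), noise could be injected at a controlled rate as follows. The function name `add_char_noise`, the `rate` and `seed` parameters, and the noise model itself are assumptions made for illustration, not the authors' method.

```python
import random
import string


def add_char_noise(text, rate, seed=0):
    """Corrupt alphabetic characters of `text`, each with probability `rate`.

    Hypothetical noise model (an assumption, not taken from the paper):
    a selected character is deleted, replaced by a random lowercase
    letter, or followed by an extra random lowercase letter.
    """
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    out = []
    for ch in text:
        if ch.isalpha() and rng.random() < rate:
            op = rng.choice(("delete", "substitute", "insert"))
            if op == "delete":
                continue  # drop the character
            if op == "substitute":
                out.append(rng.choice(string.ascii_lowercase))
            else:  # insert: keep the character, then add a spurious one
                out.append(ch)
                out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # non-letters and unselected letters pass through
    return "".join(out)


clean = "noisy text degrades classifier accuracy"
for rate in (0.0, 0.1, 0.3):
    print(rate, add_char_noise(clean, rate))
```

Applying such a function to a corpus at increasing values of `rate`, then training and evaluating a classifier at each level, would give one accuracy-versus-noise curve per dataset. Whether the paper's experiments follow this design is not stated in the abstract.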