The study of informality as a framework for evaluating the normalisation of web 2.0 texts

  • Authors:
  • Alejandro Mosquera;Paloma Moreda

  • Affiliations:
  • DLSI, University of Alicante, Alicante, Spain;DLSI, University of Alicante, Alicante, Spain

  • Venue:
  • NLDB'12 Proceedings of the 17th international conference on Applications of Natural Language Processing and Information Systems
  • Year:
  • 2012

Abstract

The language used in Web 2.0 applications such as blogging platforms, real-time chats, social networks and collaborative encyclopaedias differs markedly from that of traditional texts. Informal features such as emoticons, spelling errors and Internet-specific slang can lower the performance of Natural Language Processing applications. To overcome this problem, text normalisation approaches produce a clean word or sentence by transforming non-standard lexical or syntactic variations into their canonical forms. However, because each normalisation approach has its own characteristics, different performance metrics and evaluation procedures are in use. We hypothesise that the analysis of informality levels can be used to evaluate text normalisation techniques. In this study we therefore propose a text normalisation evaluation framework based on informality levels and apply it to Web 2.0 texts.
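The idea of comparing informality before and after normalisation can be illustrated with a minimal sketch. This is not the authors' framework: the lookup table `CANONICAL`, the toy lexicon `VOCABULARY`, the emoticon pattern and the scoring rule (share of tokens that are emoticons or out of vocabulary) are all assumptions made purely for illustration.

```python
import re

# Hypothetical table mapping non-standard forms to canonical forms (illustrative only).
CANONICAL = {"u": "you", "gr8": "great", "2moro": "tomorrow", "thx": "thanks"}

# Hypothetical toy vocabulary standing in for a real lexicon.
VOCABULARY = {"you", "are", "great", "see", "tomorrow", "thanks"}

# Simple pattern for a few common emoticons, matched after lowercasing.
EMOTICON = re.compile(r"[:;=][-']?[)(dpo]")


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def normalise(text):
    """Drop emoticons and replace known non-standard tokens with canonical forms."""
    tokens = [t for t in tokenize(text) if not EMOTICON.fullmatch(t)]
    return " ".join(CANONICAL.get(t, t) for t in tokens)


def informality(text):
    """Return the fraction of tokens that are emoticons or outside the toy vocabulary."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    informal = sum(
        1 for t in tokens if EMOTICON.fullmatch(t) or t not in VOCABULARY
    )
    return informal / len(tokens)


raw = "u are gr8 :) see u 2moro"
clean = normalise(raw)

# A normaliser that works well should lower the informality score.
print(f"before: {informality(raw):.2f}  after: {informality(clean):.2f}")
```

Under these assumptions, a drop in the score after normalisation would indicate that the normaliser removed informal features; a real evaluation would need a proper lexicon and a richer set of informality features.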