A corpus analysis of simple account texts and the proposal of simplification strategies: first steps towards text simplification systems

  • Authors:
  • Sandra M. Aluísio;Lucia Specia;Thiago A. S. Pardo;Erick G. Maziero;Helena M. Caseli;Renata P. M. Fortes

  • Affiliations:
  • Núcleo Interinstitucional de Lingüística Computacional (NILC), São Carlos/SP, Brasil;Núcleo Interinstitucional de Lingüística Computacional (NILC), São Carlos/SP, Brasil;Núcleo Interinstitucional de Lingüística Computacional (NILC), São Carlos/SP, Brasil;Núcleo Interinstitucional de Lingüística Computacional (NILC), São Carlos/SP, Brasil;Núcleo Interinstitucional de Lingüística Computacional (NILC), São Carlos/SP, Brasil;Universidade de São Paulo, São Carlos/SP, Brasil

  • Venue:
  • Proceedings of the 26th annual ACM international conference on Design of communication
  • Year:
  • 2008

Abstract

In this paper we investigate the main linguistic phenomena that can make texts complex and how they could be simplified. We focus on a corpus analysis of simple account texts available on the web for Brazilian Portuguese (BP). This study illustrates the need for text simplification to make information more accessible to poor readers and to people with cognitive disabilities. It also highlights features of simplification for BP, which may differ from those of other languages. Moreover, we propose simplification strategies and a Simplification Annotation Editor. This study is a first step towards building BP text simplification systems. One scenario in which these systems could be used is the reading of electronic texts produced, e.g., by the Brazilian government or by news agencies.