Simple English Wikipedia: a new text simplification task

  • Authors:
  • William Coster;David Kauchak

  • Affiliations:
  • Pomona College, Claremont, CA;Pomona College, Claremont, CA

  • Venue:
  • HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we examine the task of sentence simplification which aims to reduce the reading complexity of a sentence by incorporating more accessible vocabulary and sentence structure. We introduce a new data set that pairs English Wikipedia with Simple English Wikipedia and is orders of magnitude larger than any previously examined for sentence simplification. The data contains the full range of simplification operations including rewording, reordering, insertion and deletion. We provide an analysis of this corpus as well as preliminary results using a phrase-based translation approach for simplification.