For the sake of simplicity: unsupervised extraction of lexical simplifications from Wikipedia

  • Authors:
  • Mark Yatskar;Bo Pang;Cristian Danescu-Niculescu-Mizil;Lillian Lee

  • Venue:
  • HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
  • Year:
  • 2010

Abstract

We report on work in progress on extracting lexical simplifications (e.g., "collaborate" → "work together"), focusing on using edit histories in Simple English Wikipedia for this task. We consider two main approaches: (1) deriving simplification probabilities via an edit model that accounts for a mixture of different operations, and (2) using metadata to focus on edits that are more likely to be simplification operations. We find that our methods outperform a reasonable baseline and yield many high-quality lexical simplifications not included in an independently created, manually prepared list.
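
As a rough illustration of approach (2), the sketch below filters revisions by their metadata and collects the word-level replacements made in the surviving edits. It is not the authors' implementation. The revision records, the choice of the edit comment as the metadata field, the "simpl" keyword heuristic, and all function names are assumptions made for this example only.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical revision records: (edit comment, text before, text after).
REVISIONS = [
    ("simplify wording", "they collaborate on projects", "they work together on projects"),
    ("fix typo", "the cat sat on teh mat", "the cat sat on the mat"),
    ("simplified", "scientists utilize tools", "scientists use tools"),
]


def looks_like_simplification(comment, keyword="simpl"):
    """Assumed heuristic: the edit comment mentions simplification."""
    return keyword in comment.lower()


def replaced_phrases(before, after):
    """Yield (old phrase, new phrase) pairs for word spans the edit replaced."""
    old_words, new_words = before.split(), after.split()
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            yield " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])


def candidate_simplifications(revisions):
    """Count replacements, keeping only edits whose metadata passes the filter."""
    counts = Counter()
    for comment, before, after in revisions:
        if looks_like_simplification(comment):
            counts.update(replaced_phrases(before, after))
    return counts


candidates = candidate_simplifications(REVISIONS)
for (old, new), n in candidates.items():
    print(f"{old} -> {new} ({n})")
```

On this toy data the typo fix is discarded because its comment does not pass the filter, while the two replacements from simplification-tagged edits are kept as candidates. A real pipeline would also need to align sentences across revisions and rank the candidates, which this sketch leaves out.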