Strength and similarity of affix removal stemming algorithms

  • Authors: William B. Frakes; Christopher J. Fox
  • Affiliations: Virginia Tech; James Madison University
  • Venue: ACM SIGIR Forum
  • Year: 2003

Abstract

This study evaluated the strength of, and similarity among, four affix removal stemming algorithms. Strength and similarity were measured in several ways, including new metrics based on the Hamming distance measure. Data were collected on stemmer outputs for a list of 49,656 English words derived from the UNIX spelling dictionary and the Moby corpus. Conclusions about the relative strength and similarity of the four stemming algorithms are reported.
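
As an illustration of how a Hamming-style metric might quantify stemmer strength and similarity, the sketch below compares strings position by position and adds the length difference when they are unequal. This extension to unequal lengths is an assumption made for illustration and may not match the paper's exact definition. The two toy stemmers (`strip_s`, `strip_suffixes`) and the three-word list are hypothetical stand-ins, not the four algorithms or the word list used in the study.

```python
def modified_hamming(a: str, b: str) -> int:
    """Mismatched positions over the shorter length, plus the length difference.

    Assumption: this is one plausible way to apply Hamming distance to
    strings of unequal length; the paper's definition may differ.
    """
    mismatches = sum(1 for x, y in zip(a, b) if x != y)  # zip stops at the shorter string
    return mismatches + abs(len(a) - len(b))


def mean_word_to_stem_distance(words, stemmer):
    """Strength proxy: average distance between each word and its stem."""
    return sum(modified_hamming(w, stemmer(w)) for w in words) / len(words)


def mean_stem_to_stem_distance(words, stemmer_a, stemmer_b):
    """Similarity proxy: average distance between two stemmers' outputs.

    Smaller values mean the two stemmers behave more alike on this list.
    """
    total = sum(modified_hamming(stemmer_a(w), stemmer_b(w)) for w in words)
    return total / len(words)


def strip_s(word: str) -> str:
    """Hypothetical weak stemmer: removes only a trailing 's'."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def strip_suffixes(word: str) -> str:
    """Hypothetical stronger stemmer: removes 'ing', 'ed', or 's'."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    words = ["cats", "walking", "tree"]  # toy word list, not the study's data
    print(mean_word_to_stem_distance(words, strip_s))
    print(mean_word_to_stem_distance(words, strip_suffixes))
    print(mean_stem_to_stem_distance(words, strip_s, strip_suffixes))
```

Under these assumptions, a stemmer that removes more characters on average yields a larger word-to-stem distance, and two stemmers that produce the same stems for most words yield a stem-to-stem distance near zero.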