Another stemmer

  • Authors: Chris D. Paice
  • Affiliations: -
  • Venue: ACM SIGIR Forum
  • Year: 1990

Abstract

In natural language processing, conflation is the process of merging or lumping together nonidentical words which refer to the same principal concept. This can relate both to words which are entirely different in form (e.g., "group" and "collection"), and to words which share some common root (e.g., "group", "grouping", "subgroups"). In the former case the words can only be mapped by referring to a dictionary or thesaurus, but in the latter case use can be made of the orthographic similarities between the forms. One popular approach is to remove affixes from the input words, thus reducing them to a stem; if this could be done correctly, all the variant forms of a word would be converted to the same standard form. Since the process is aimed at mapping for retrieval purposes, the stem need not be a linguistically correct lemma or root (see also Frakes 1982).
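The affix-removal idea described in the abstract can be illustrated with a minimal sketch. This is not the algorithm presented in the paper: the affix lists, the minimum stem length, and the function names `stem` and `conflate` are all assumptions chosen only so that the abstract's own example words fall into one class.

```python
from collections import defaultdict

# Hypothetical affix lists and length threshold, chosen for illustration only.
# Longer suffixes come first so that the longest match is removed.
PREFIXES = ("sub",)
SUFFIXES = ("ings", "ing", "es", "s")
MIN_STEM_LENGTH = 3


def stem(word: str) -> str:
    """Strip at most one listed prefix and one listed suffix from a word."""
    result = word.lower()
    for prefix in PREFIXES:
        if result.startswith(prefix) and len(result) - len(prefix) >= MIN_STEM_LENGTH:
            result = result[len(prefix):]
            break
    for suffix in SUFFIXES:
        if result.endswith(suffix) and len(result) - len(suffix) >= MIN_STEM_LENGTH:
            result = result[:-len(suffix)]
            break
    return result


def conflate(words):
    """Group words into classes keyed by their shared stem."""
    classes = defaultdict(list)
    for word in words:
        classes[stem(word)].append(word)
    return dict(classes)


if __name__ == "__main__":
    print(conflate(["group", "grouping", "subgroups", "collection"]))
```

Under these assumptions, "group", "grouping" and "subgroups" all reduce to the stem "group", while "collection" stays in a class of its own, since relating it to "group" would need a dictionary or thesaurus. The sketch also yields "runn" for "running", which shows that a stem produced this way need not be a linguistically correct lemma.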