The efficient computation of complete and concise substring scales with suffix trees

  • Authors: Sébastien Ferré

  • Affiliations: Irisa/Université de Rennes 1, Rennes cedex, France

  • Venue: ICFCA'07 Proceedings of the 5th international conference on Formal concept analysis
  • Year: 2007

Abstract

Strings are an important part of most multivalued contexts in real applications. Their conceptual treatment requires the definition of substring scales, i.e., sets of relevant substrings, so as to form informative concepts. However, these scales are either defined by hand or derived in a context-unaware manner (e.g., all words occurring in string values). We present an efficient algorithm based on suffix trees that produces complete and concise substring scales. Completeness ensures that every possible concept is formed, as when considering the scale of all substrings. Conciseness ensures that the number of scale attributes (substrings) is less than the cumulative size of all string values. This algorithm is integrated in Camelis and illustrated on the set of all ICCS paper titles.
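
To make the two properties concrete, here is a minimal brute-force sketch in Python. It is not the suffix-tree algorithm of the paper and makes no claim to its efficiency: it simply enumerates every substring, records its extent (the set of strings containing it), and keeps only the substrings that cannot be extended by one character without shrinking that extent. The function names and the use of list indices as object identifiers are assumptions made for illustration, as is this reading of which substrings a concise scale retains.

```python
def all_substrings(s):
    """Return the set of all non-empty substrings of s."""
    return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}


def substring_scale(strings):
    """Naive illustration (hypothetical helper, not the paper's algorithm).

    Maps each retained substring to its extent, i.e. the set of indices
    of the input strings that contain it. A substring is retained only
    if every one-character extension has a strictly smaller extent.
    """
    # Extent of every substring occurring in at least one string.
    extent = {}
    for idx, s in enumerate(strings):
        for sub in all_substrings(s):
            extent.setdefault(sub, set()).add(idx)

    # A substring is redundant if dropping the first or last character
    # of some longer substring yields it without changing the extent.
    redundant = set()
    for t, objs in extent.items():
        if len(t) > 1:
            for shorter in (t[:-1], t[1:]):
                if extent[shorter] == objs:
                    redundant.add(shorter)

    return {sub: objs for sub, objs in extent.items() if sub not in redundant}
```

For the two strings "abc" and "abd", this sketch keeps only "ab", "abc", and "abd". Those three attributes produce the same extents as the nine distinct substrings do (completeness), and three is smaller than the cumulative size of six characters (conciseness). The enumeration is quadratic in the length of each string, which is the cost a suffix-tree approach is meant to avoid.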