On the Structure of Consistent Partitions of Substring Set of a Word

  • Authors:
  • Meng Zhang;Yi Zhang;Liang Hu;Peichen Xin

  • Affiliations:
  • College of Computer Science and Technology, Jilin University, Changchun, China 130012 (all authors)

  • Venue:
  • FAW '09 Proceedings of the 3rd International Workshop on Frontiers in Algorithmics
  • Year:
  • 2009

Abstract

The directed acyclic word graph (DAWG) is a key data structure for string matching and is widely used in bioinformatics and data compression, but DAWGs are memory-intensive. The weighted directed word graph (WDWG) is a space-economical variation of the DAWG that is as powerful as the DAWG. The concept underlying WDWGs is a new equivalence relation on the substrings of a word, namely the minimal consistent linear partition. However, the structure of consistent linear partitions has not been extensively explored. In this paper, we present a theorem that gives insight into the structure of consistent partitions. Through this theorem, one can enumerate all the consistent linear partitions and verify whether a linear partition is consistent. It also demonstrates how to merge the DAWG into a consistent partition. In the end, we give a simple and easy-to-construct class of consistent partitions based on lexicographic order.
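
As background for the phrase "equivalence relation on the substrings of a word", the following minimal sketch groups the non-empty substrings of a word by their sets of end positions, which is the standard equivalence whose classes correspond to the non-initial states of a DAWG. It is not the paper's consistent linear partition, whose definition the abstract does not give. The function name `endpos_classes` and the example word are assumptions chosen for illustration, and the brute-force enumeration is only meant for short words.

```python
from collections import defaultdict


def endpos_classes(word):
    """Group the non-empty substrings of `word` by their set of end positions.

    Illustrative brute force (quadratically many substrings); hypothetical
    helper, not taken from the paper.
    """
    ends = defaultdict(set)
    n = len(word)
    for i in range(n):
        for j in range(i + 1, n + 1):
            # word[i:j] is an occurrence ending at (1-based) position j
            ends[word[i:j]].add(j)

    classes = defaultdict(list)
    for sub, positions in ends.items():
        classes[frozenset(positions)].append(sub)

    # Within a class, list members from shortest to longest.
    return {key: sorted(members, key=len) for key, members in classes.items()}


if __name__ == "__main__":
    for positions, members in endpos_classes("abcbc").items():
        print(sorted(positions), members)
```

In each class produced this way, every member is a suffix of the longest member, so a class forms a chain ordered by length.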