Fast construction of generalized suffix trees over a very large alphabet

  • Authors:
  • Zhixiang Chen; Richard Fowler; Ada Wai-Chee Fu; Chunyue Wang

  • Affiliations:
  • Department of Computer Science, University of Texas-Pan American, Edinburg, TX; Department of Computer Science, University of Texas-Pan American, Edinburg, TX; Department of Computer Science, Chinese University of Hong Kong, Shatin, N.T., Hong Kong; Department of Computer Science, University of Texas-Pan American, Edinburg, TX

  • Venue:
  • COCOON'03: Proceedings of the 9th Annual International Conference on Computing and Combinatorics
  • Year:
  • 2003

Abstract

The work in this paper is motivated by real-world problems such as mining frequent traversal path patterns from very large Web logs. Generalized suffix trees over a very large alphabet can be used to solve such problems. However, traditional algorithms such as the Weiner, Ukkonen and McCreight algorithms are not sufficiently practical for these problems, because both the alphabet and the set of strings involved are very large. Two new algorithms are designed for fast construction of generalized suffix trees over a very large alphabet, and their performance is analyzed in comparison with the well-known Ukkonen algorithm. It is shown that these two algorithms perform better and can handle large alphabets and large string sets well.
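
To make the setting concrete, the following is a minimal sketch of how a suffix structure over several sequences can expose frequent traversal paths. It is not either of the two algorithms from the paper, nor the Ukkonen algorithm: it builds a naive, uncompacted generalized suffix trie in quadratic time per sequence, purely for illustration. The names `Node`, `build_suffix_trie`, and `frequent_paths` are hypothetical, and the use of a hash map for child pointers is an assumed way of coping with a large alphabet (for example, page identifiers), so that node size does not depend on alphabet size.

```python
class Node:
    """One trie node; children are kept in a dict keyed by symbol."""
    __slots__ = ("children", "count", "last_seen")

    def __init__(self):
        self.children = {}    # symbol -> Node; only symbols actually seen are stored
        self.count = 0        # number of distinct sequences containing this path
        self.last_seen = -1   # id of the last sequence that touched this node


def build_suffix_trie(sequences):
    """Insert every suffix of every sequence (naive, illustrative only)."""
    root = Node()
    for seq_id, seq in enumerate(sequences):
        for start in range(len(seq)):
            node = root
            for symbol in seq[start:]:
                child = node.children.get(symbol)
                if child is None:
                    child = Node()
                    node.children[symbol] = child
                node = child
                # Count each sequence at most once per node.
                if node.last_seen != seq_id:
                    node.last_seen = seq_id
                    node.count += 1
    return root


def frequent_paths(root, min_support):
    """Return {path: support} for contiguous paths in >= min_support sequences."""
    results = {}
    stack = [(root, ())]
    while stack:
        node, path = stack.pop()
        for symbol, child in node.children.items():
            # Support never increases when a path is extended, so pruning is safe.
            if child.count >= min_support:
                new_path = path + (symbol,)
                results[new_path] = child.count
                stack.append((child, new_path))
    return results


# Hypothetical Web sessions; each symbol is a page name.
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "about"],
    ["products", "cart", "checkout"],
]
trie = build_suffix_trie(sessions)
frequent = frequent_paths(trie, min_support=2)
```

Here `frequent` maps each contiguous page path that occurs in at least two sessions to the number of sessions containing it. A true generalized suffix tree would compact chains of single-child nodes and be built in far less time; the sketch only shows why such a structure answers the mining question.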