N-gram analysis based on zero-suppressed BDDs

  • Authors: Ryutaro Kurai; Shin-ichi Minato; Thomas Zeugmann
  • Affiliations: Hokkaido University, Sapporo, Japan (all authors)
  • Venue: JSAI'06: Proceedings of the 20th Annual Conference on New Frontiers in Artificial Intelligence
  • Year: 2006

Abstract

In this paper, we propose a new method of n-gram analysis using zero-suppressed BDDs (ZBDDs). ZBDDs are known as a compact representation of combinatorial item sets. Here, we apply ZBDD-based techniques to the efficient handling of sets of sequences. Using the algebraic operations defined over ZBDDs, such as union, intersection, and difference, we can carry out a variety of processing and analysis tasks on large-scale sequence data. We conducted experiments in which n-gram statistics were generated from real document files. The results indicate the potential of the ZBDD-based method for sequence database analysis.
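
To make the idea of set algebra over sequences concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one possible encoding, in which each n-gram becomes a set of (position, character) items, and it stores families of such sets in a toy ZBDD built from nested tuples. All names (`node`, `single`, `ngram_family`, `decode`, and so on) are hypothetical, and the sketch tracks only which n-grams occur, not how often they occur.

```python
from functools import lru_cache

# Toy ZBDD: terminals are 0 (empty family) and 1 (family holding only the
# empty set); an internal node is a tuple (var, lo, hi). Smaller variables
# sit nearer the root. This is an illustrative sketch, not a real library.
EMPTY, BASE = 0, 1
UNICODE = 0x110000  # assumed item id layout: position * UNICODE + code point


def node(var, lo, hi):
    # Zero-suppression rule: drop a node whose 1-edge leads to EMPTY.
    return lo if hi == EMPTY else (var, lo, hi)


def top(f):
    # Terminals are treated as having an infinitely large variable.
    return f[0] if isinstance(f, tuple) else float("inf")


def single(items):
    # Family containing exactly one set of items.
    f = BASE
    for v in sorted(items, reverse=True):
        f = node(v, EMPTY, f)
    return f


@lru_cache(maxsize=None)
def union(f, g):
    if f == EMPTY:
        return g
    if g == EMPTY or f == g:
        return f
    vf, vg = top(f), top(g)
    if vf < vg:
        return node(vf, union(f[1], g), f[2])
    if vg < vf:
        return node(vg, union(f, g[1]), g[2])
    return node(vf, union(f[1], g[1]), union(f[2], g[2]))


@lru_cache(maxsize=None)
def intersect(f, g):
    if f == EMPTY or g == EMPTY:
        return EMPTY
    if f == g:
        return f
    vf, vg = top(f), top(g)
    if vf < vg:
        return intersect(f[1], g)
    if vg < vf:
        return intersect(f, g[1])
    return node(vf, intersect(f[1], g[1]), intersect(f[2], g[2]))


@lru_cache(maxsize=None)
def diff(f, g):
    if f == EMPTY or f == g:
        return EMPTY
    if g == EMPTY:
        return f
    vf, vg = top(f), top(g)
    if vf < vg:
        return node(vf, diff(f[1], g), f[2])
    if vg < vf:
        return diff(f, g[1])
    return node(vf, diff(f[1], g[1]), diff(f[2], g[2]))


@lru_cache(maxsize=None)
def count(f):
    # Number of sets in the family (here: number of distinct n-grams).
    if not isinstance(f, tuple):
        return f
    return count(f[1]) + count(f[2])


def ngram_family(text, n):
    # Union of one single-set ZBDD per n-gram occurrence in the text.
    fam = EMPTY
    for i in range(len(text) - n + 1):
        items = {pos * UNICODE + ord(ch) for pos, ch in enumerate(text[i:i + n])}
        fam = union(fam, single(items))
    return fam


def members(f, prefix=()):
    # Enumerate every set in the family as a tuple of item ids.
    if f == EMPTY:
        return
    if f == BASE:
        yield prefix
        return
    yield from members(f[1], prefix)
    yield from members(f[2], prefix + (f[0],))


def decode(f):
    # Item ids along a path are in position order, so joining recovers the n-gram.
    return ["".join(chr(v % UNICODE) for v in items) for items in members(f)]
```

With this sketch, `intersect(ngram_family("abab", 2), ngram_family("abc", 2))` decodes to the single shared bigram "ab", while `diff` on the same arguments leaves "ba". Tuples are used for nodes so that structurally equal subgraphs compare equal and the memoized operations can reuse results; a production ZBDD package would instead use a unique table and would need an extra mechanism to record n-gram frequencies.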