Pattern matching and pattern discovery in scientific, program, and document databases

  • Authors: Jason T. L. Wang; Kaizhong Zhang; Dennis Shasha

  • Affiliations: Department of Computer and Information Science, New Jersey Institute of Technology, University Heights, Newark, New Jersey; Department of Computer Science, The University of Western Ontario, London, Ontario, Canada N6A 5B7; Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, New York

  • Venue: SIGMOD '95: Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data
  • Year: 1995

Abstract

Over the past several years we have created or borrowed algorithms for combinatorial pattern matching and pattern discovery on sequences [2] and trees. In matching problems, given a pattern, a set of data objects, and a distance metric, we find the distance between the pattern and one or more data objects. In discovery problems, by contrast, given a set of objects, a metric, and a distance, we seek a pattern that matches many of those objects within the given distance. (In this sense, discovery is a lot like data mining.) Our toolkit performs both matching and discovery, with current targeted applications in molecular biology and document comparison.
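
To make the distinction between matching and discovery concrete, here is a minimal Python sketch for sequences only. It is not the authors' toolkit or their algorithms. It assumes unit-cost edit distance as the metric, treats a pattern as "matching" an object when some substring of the object lies within the given distance, and uses a brute-force search over fixed-length substrings as candidate patterns. The function names (edit_distance, substring_distance, match, discover) and the sample sequences are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit (Levenshtein) distance between two whole sequences."""
    prev = list(range(len(b) + 1))            # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                            # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute or keep
        prev = curr
    return prev[-1]


def substring_distance(pattern: str, text: str) -> int:
    """Smallest edit distance between pattern and any substring of text."""
    prev = [0] * (len(text) + 1)              # a match may start anywhere in text
    for i, cp in enumerate(pattern, start=1):
        curr = [i]
        for j, ct in enumerate(text, start=1):
            curr.append(min(prev[j] + 1,
                            curr[j - 1] + 1,
                            prev[j - 1] + (cp != ct)))
        prev = curr
    return min(prev)                          # a match may end anywhere in text


def match(pattern: str, objects: List[str]) -> List[Tuple[str, int]]:
    """Matching: report the distance from one pattern to each data object."""
    return [(obj, substring_distance(pattern, obj)) for obj in objects]


def discover(objects: List[str], max_dist: int, length: int) -> Tuple[str, int]:
    """Discovery (brute force, assumed candidate set): among all substrings of
    the given length, return one that matches the most objects within max_dist,
    together with the number of objects it matches."""
    candidates = sorted({obj[i:i + length]
                         for obj in objects
                         for i in range(len(obj) - length + 1)})
    if not candidates:
        raise ValueError("no object is long enough to supply a candidate")

    def support(p: str) -> int:
        return sum(substring_distance(p, obj) <= max_dist for obj in objects)

    best = max(candidates, key=support)       # ties go to the first candidate in sorted order
    return best, support(best)


if __name__ == "__main__":
    sequences = ["ACGTTGCA", "TTACGTAG", "GGACCTTA", "CCCCCCCC"]  # made-up data
    print(match("ACGT", sequences))
    print(discover(sequences, max_dist=1, length=4))
```

In this sketch, matching takes the pattern as input and returns distances, whereas discovery takes the distance as input and returns a pattern. The exhaustive candidate search is only practical for small inputs, and trees would need a tree edit distance in place of the sequence metric used here.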