Pattern matching and pattern discovery in scientific, program, and document databases

Authors:
Jason T. L. Wang;Kaizhong Zhang;Dennis Shasha
Affiliations:
Department of Computer and Information Science, New Jersey Institute of Technology, University Heights, Newark, New Jersey;Department of Computer Science, The University of Western Ontario, London, Ontario, Canada N6A 5B7;Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, New York
Venue:
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Year:
1995

Citing (2):
Combinatorial pattern discovery for scientific data: some preliminary results. SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
From structured documents to novel query facilities. SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data

Cited by (4):
Structural matching and discovery in document databases. SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Approximated trial and error analysis in scientific databases. Information Systems - Special issue: Best papers from EDBT 2002
Optimizing Scientific Databases for Client Side Data Processing. EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
Scientific data repositories: designing for a moving target. Proceedings of the 2003 ACM SIGMOD international conference on Management of data

Abstract

Over the past several years we have created or borrowed algorithms for combinatorial pattern matching and pattern discovery on sequences [2] and trees. In matching problems, given a pattern, a set of data objects, and a distance metric, we find the distance between the pattern and one or more data objects. In discovery problems, by contrast, given a set of objects, a metric, and a distance, we seek a pattern that matches many of those objects within the given distance. (So, discovery is a lot like data mining.) Our toolkit performs both matching and discovery, with current targeted applications in molecular biology and document comparison.
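
The distinction between matching and discovery can be illustrated with a minimal sketch on sequences. This is not the authors' toolkit: it assumes plain Levenshtein edit distance as the metric, compares whole strings rather than substructures, and restricts candidate patterns to substrings of the input objects, searched by brute force. The function names `edit_distance`, `match`, and `discover` are hypothetical and chosen only for this illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic program (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def match(pattern: str, objects: list[str]) -> dict[str, int]:
    """Matching: given a pattern, report its distance to each data object."""
    return {obj: edit_distance(pattern, obj) for obj in objects}


def discover(objects: list[str], max_dist: int) -> tuple[str, int]:
    """Discovery: find a pattern within max_dist of as many objects as possible.

    Assumption for this sketch: candidates are all substrings of the objects.
    """
    candidates = sorted({obj[i:j]
                         for obj in objects
                         for i in range(len(obj))
                         for j in range(i + 1, len(obj) + 1)})

    def support(pattern: str) -> int:
        # Number of objects the pattern matches within the allowed distance.
        return sum(edit_distance(pattern, obj) <= max_dist for obj in objects)

    best = max(candidates, key=support)
    return best, support(best)


objects = ["ACGT", "ACGA", "TCGT", "GGGG"]
print(match("ACGT", objects))         # {'ACGT': 0, 'ACGA': 1, 'TCGT': 1, 'GGGG': 3}
print(discover(objects, max_dist=1))  # ('ACGT', 3)
```

In matching, the pattern is an input and the distances are the output. In discovery, the distance bound is an input and the pattern is the output. The exhaustive candidate search here is only practical for tiny inputs.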