Searching in Parallel for Similar Strings

Authors:
Isidore Rigoutsos;Andrea Califano
Venue:
IEEE Computational Science & Engineering
Year:
1994

Abstract

Distributed computation, probabilistic indexing and hashing techniques combine to create a novel approach to processing very large biological-sequence databases. Other data-intensive tasks could also benefit. Our indexing-based approach enables fast similarity searching through a large database of strings. Thanks to a redundant table-lookup scheme, recovering database items that match a test sequence requires minimal data access. We have implemented a uniprocessor version of this approach called Flash (Fast Lookup Algorithm for String Homology) as well as a distributed version, dFlash, using a cluster of seven non-dedicated workstations connected through a local area network. In this article, we present an approach for retrieving homologies in databases of proteins.