Introducing dependencies into alignment analysis and its use for local structure prediction in proteins

Authors:
Szymon Nowakowski;Krzysztof Fidelis;Jerzy Tiuryn
Affiliations:
Institute of Informatics, Warsaw University, Warszawa, Poland;Genome Center, University of California, Davis, Genome and Biomedical Sciences Facility, Davis, CA;Institute of Informatics, Warsaw University, Warszawa, Poland
Venue:
PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
Year:
2005

Abstract

In this paper we explore several techniques for analysing sequence alignments. Their common idea is to generalize an alignment by means of a probability distribution. The Dirichlet mixture method serves as the reference against which the new techniques are assessed. The techniques are compared in a cross-validation test on both synthetic and real data: we use them to identify sequence-structure relationships between a target protein and possible local motifs. We show that the Beta method is almost as successful as the reference method while being much faster (up to 17 times). MAP (Maximum a Posteriori) estimation for two PSSMs (Position Specific Score Matrices) introduces dependencies between columns of an alignment. In our experiments it is much more successful than the reference method, but it is computationally very expensive. To address this cost, we developed a parallel implementation.
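
To illustrate what generalizing an alignment by a probability distribution can look like, the following is a minimal sketch, not the method of the paper. It assumes a gap-free toy alignment and smooths each column with a single uniform pseudocount, which is a much simpler regularizer than the Dirichlet mixture, Beta, or MAP techniques named in the abstract. The function name `column_distributions` and the `pseudocount` parameter are hypothetical and chosen only for this example.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distributions(alignment, pseudocount=1.0):
    """Return one amino acid probability distribution per alignment column.

    Assumption: a uniform pseudocount stands in for the prior; this is a
    simplification and not the Dirichlet mixture method itself.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for col in range(length):
        # Count observed residues in this column, skipping unknown symbols.
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        # Smoothed relative frequencies: unseen residues keep nonzero mass.
        profile.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profile


# Toy alignment of three sequences with three columns each.
profile = column_distributions(["ACD", "ACE", "GCD"])
```

Each entry of `profile` treats its column independently of the others. The MAP estimation for two PSSMs described in the abstract is what introduces dependencies between columns, and this sketch does not attempt that.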