Algorithms on strings, trees, and sequences: computer science and computational biology
Finding Regulatory Elements Using Joint Likelihoods for Sequence and Expression Profile Data
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
An integrated statistical comparative analysis between variant genetic datasets of Mus musculus
International Journal of Computational Intelligence in Bioinformatics and Systems Biology
Regulatory sequence elements provide important clues to understanding and predicting gene expression. Although the binding sites for hundreds of transcription factors are known, there has been no systematic attempt to incorporate this information in the annotation of the human genome. Cross-species sequence comparisons are critical to a meaningful annotation of regulatory elements since they generally reside in conserved non-coding regions. To take advantage of the recently completed drafts of the mouse and human genomes for annotating transcription factor binding sites, we developed SMASH, a computational pipeline that identifies thousands of orthologous human/mouse proteins, maps them to genomic sequences, extracts and compares upstream regions, and annotates putative regulatory elements in conserved, non-coding, upstream regions. Our current dataset consists of approximately 2500 human/mouse gene pairs. Transcription start sites were estimated by mapping quasi-full-length cDNA sequences. SMASH uses a novel probabilistic method to identify putative conserved binding sites that takes into account the competition between transcription factors for binding DNA. SMASH presents the results via a genome browser web interface which displays the predicted regulatory information together with the current annotations for the human genome. Our results are validated by comparison to previously published experimental data. SMASH results compare favorably to other existing computational approaches.
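The idea of annotating binding sites only where they fall in conserved upstream sequence can be illustrated with a minimal sketch. This is not SMASH's probabilistic method, which the abstract does not specify and which also models competition between transcription factors. It is a much simpler stand-in: a log-odds position weight matrix is scanned over two upstream regions, and a site is kept only if it scores above a threshold at the same position in both species. The count matrix, the uniform background, the pseudocount, the threshold, the example sequences, and the assumption that the two regions are already aligned without gaps are all hypothetical choices made for illustration.

```python
import math

# Hypothetical count matrix for an illustrative 4-bp motif (consensus AGCT).
# It does not describe any real transcription factor.
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 0, 9, 1],
    "G": [1, 9, 0, 0],
    "T": [0, 0, 0, 7],
}
BACKGROUND = 0.25   # assumed uniform base frequencies
PSEUDOCOUNT = 1.0   # assumed smoothing to avoid log(0)


def log_odds_matrix(counts):
    """Convert per-position base counts into log2 odds against the background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[b][i] + PSEUDOCOUNT for b in "ACGT")
        matrix.append({
            b: math.log2((counts[b][i] + PSEUDOCOUNT) / total / BACKGROUND)
            for b in "ACGT"
        })
    return matrix


def score(matrix, window):
    """Sum the log-odds of each base in a window as wide as the matrix."""
    return sum(column[base] for column, base in zip(matrix, window))


def conserved_sites(matrix, human, mouse, threshold):
    """Return start positions where both sequences score at or above threshold.

    Assumes `human` and `mouse` are equal-length, gap-free, pre-aligned
    upstream regions containing only A, C, G, and T.
    """
    if len(human) != len(mouse):
        raise ValueError("sequences must be aligned to the same length")
    width = len(matrix)
    hits = []
    for start in range(len(human) - width + 1):
        h_window = human[start:start + width]
        m_window = mouse[start:start + width]
        if score(matrix, h_window) >= threshold and score(matrix, m_window) >= threshold:
            hits.append(start)
    return hits


if __name__ == "__main__":
    pwm = log_odds_matrix(COUNTS)
    human_upstream = "TTAGCTGGAGCA"   # made-up sequence
    mouse_upstream = "TTAGCTCCTTTT"   # made-up sequence
    print(conserved_sites(pwm, human_upstream, mouse_upstream, threshold=4.0))
```

In this toy input the human sequence contains a second, weaker match near its end that has no counterpart in the mouse sequence, so the conservation requirement filters it out. Handling real alignments with gaps, both strands, and multiple competing factors would need considerably more machinery than shown here.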