Detecting LTR structures in human genomic sequences using profile hidden Markov models

  • Authors:
  • Li-Ching Wu; Hsien-Da Huang; Yu-Chung Chang; Ying-Chun Lee; Jorng-Tzong Horng

  • Affiliations:
  • Institute of System Biology and Bioinformatics, National Central University, Taiwan; Department of Biological Science and Technology, Institute of Bioinformatics, National Chiao-Tung University, Hsin-Chu, Taiwan; Department of Biotechnology, Ming Chuan University, Taiwan; Department of Computer Science and Information Engineering, National Central University, Taiwan; Institute of System Biology and Bioinformatics, National Central University, Taiwan and Department of Computer Science and Information Engineering, National Central University, Taiwan

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2009

Quantified Score

Hi-index 12.05

Abstract

More than 45% of the human genome has been annotated as transposable elements (TEs). The mobilization of these TEs expands the human genome and may increase its plasticity and variation. Long terminal repeat (LTR) retrotransposons are an important class of TEs. LTRs contain regulatory sites, which the authors believe may be conserved in evolution. The authors therefore propose a new LTR detection method: significant motifs in LTR sequences are identified and used to train profile hidden Markov models (HMMs). These models serve as fingerprints that detect most of the known LTRs reported by RepeatMasker. The same predictive models classify LTR instances into families, and the resulting LTRs can support evolutionary analysis. Experimental results show that the proposed approach not only recovers most of the LTRs found by RepeatMasker but also detects some novel LTRs, which may be structurally incomplete or degenerate.
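
The idea of training a model on conserved motif instances and then using it as a fingerprint to scan sequence can be illustrated with a minimal sketch. This is not the authors' implementation. As a simplifying assumption it keeps only the match states of a profile HMM (no insert or delete states), which reduces the model to a position-specific scoring matrix scored as log-odds against a uniform background. The function names, the pseudocount, the threshold, and the toy motifs are all hypothetical and chosen only for illustration.

```python
import math

ALPHABET = "ACGT"


def train_match_states(aligned_motifs, pseudocount=1.0):
    """Estimate per-column emission probabilities from equal-length motif instances.

    Assumption: the instances are already aligned without gaps, so each
    column corresponds to one match state.
    """
    length = len(aligned_motifs[0])
    if any(len(m) != length for m in aligned_motifs):
        raise ValueError("motif instances must have equal length")
    columns = []
    for i in range(length):
        # Pseudocounts keep unseen bases from getting probability zero.
        counts = {base: pseudocount for base in ALPHABET}
        for motif in aligned_motifs:
            counts[motif[i]] += 1.0
        total = sum(counts.values())
        columns.append({base: counts[base] / total for base in ALPHABET})
    return columns


def log_odds(columns, window, background=0.25):
    """Sum of log2(emission / background) over the window, one column per base."""
    return sum(math.log2(col[base] / background)
               for col, base in zip(columns, window))


def scan(columns, sequence, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(columns)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in ALPHABET for base in window):
            continue  # skip windows with ambiguous bases such as N
        score = log_odds(columns, window)
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy data, invented for illustration only (not motifs from the paper).
motifs = ["TATAAA", "TATAAG", "TATATA", "TATAAA"]
model = train_match_states(motifs)
hits = scan(model, "GGCGTATAAACCG", threshold=4.0)
for start, score in hits:
    print(start, round(score, 2))
```

A full profile HMM would add insert and delete states with transition probabilities, so that degenerate or structurally incomplete copies can still align to the model; dedicated tools are normally used for that rather than hand-written code.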