Splice sites prediction of Human genome using length-variable Markov model and feature selection

Authors:
Quanwei Zhang;Qinke Peng;Qi Zhang;Yanhua Yan;Kankan Li;Jing Li
Affiliations:
Xi'an Jiaotong University, State Key Laboratory for Manufacturing Systems Engineering, School of Electronic and Information Engineering, Xi'an, China;Xi'an Jiaotong University, State Key Laboratory for Manufacturing Systems Engineering, School of Electronic and Information Engineering, Xi'an, China;Xi'an Jiaotong University, State Key Laboratory for Manufacturing Systems Engineering, School of Electronic and Information Engineering, Xi'an, China;Xi'an Jiaotong University, State Key Laboratory for Manufacturing Systems Engineering, School of Electronic and Information Engineering, Xi'an, China;Xi'an Jiaotong University, State Key Laboratory for Manufacturing Systems Engineering, School of Electronic and Information Engineering, Xi'an, China;Xi'an Jiaotong University, State Key Laboratory for Manufacturing Systems Engineering, School of Electronic and Information Engineering, Xi'an, China
Venue:
Expert Systems with Applications: An International Journal
Year:
2010

Citing 6
Cited 0

Effective hidden Markov models for detecting splicing junction sites in DNA sequences
Information Sciences: an International Journal

Markov Encoding for Detecting Signals in Genomic Sequences
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

Prediction of splice sites with dependency graphs and their expanded bayesian networks
Bioinformatics

Importance of RNA secondary structure information for yeast donor and acceptor splice site predictions by neural networks
Computational Biology and Chemistry

Splice site prediction using support vector machines with a Bayes kernel
Expert Systems with Applications: An International Journal

Understanding protein structure prediction using SVM_DT
ISPA'05 Proceedings of the 2005 international conference on Parallel and Distributed Processing and Applications

Abstract

With the rapid increase in available DNA sequences, there is a pressing need for effective methods to detect genes and their structures, and splice site prediction plays a key role among them. The junctions between introns and exons contain conserved segments, which can help predict splice sites computationally; however, it is unclear which nucleotides contribute to the splicing process, so a suitable set of features must be selected for splice site prediction. This paper proposes a length-variable Markov model. With this model, a suitable subset of features can be chosen as the detecting features for each candidate splice site according to the likelihood ratio at each position. Our experimental results show that the models not only achieve higher prediction accuracy than the basic Markov model and some existing methods, but also retain the low time cost of the basic Markov model.
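
To make the idea of position-wise likelihood ratios concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the selection rule, so the first-order position-specific model, the add-one pseudocount, the threshold parameter `min_abs_ratio`, and all function names are assumptions made for illustration, and the toy sequences are invented. The sketch trains one model on true sites and one on decoys, computes a log-likelihood ratio at each position of a candidate, and sums only the positions whose ratio is informative enough.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_first_order(seqs, pseudocount=1.0):
    """Estimate position-specific first-order Markov probabilities.

    probs[0][(None, c)] is P(x_0 = c); for i >= 1,
    probs[i][(p, c)] is P(x_i = c | x_{i-1} = p).
    All sequences are assumed to be aligned and of equal length.
    """
    length = len(seqs[0])
    probs = []
    for i in range(length):
        counts = defaultdict(float)
        for s in seqs:
            prev = s[i - 1] if i > 0 else None
            counts[(prev, s[i])] += 1.0
        table = {}
        for prev in (ALPHABET if i > 0 else [None]):
            total = sum(counts[(prev, c)] for c in ALPHABET)
            total += pseudocount * len(ALPHABET)
            for c in ALPHABET:
                table[(prev, c)] = (counts[(prev, c)] + pseudocount) / total
        probs.append(table)
    return probs


def position_log_ratios(seq, pos_model, neg_model):
    """Log-likelihood ratio (true site vs. decoy) at each position."""
    ratios = []
    for i, c in enumerate(seq):
        prev = seq[i - 1] if i > 0 else None
        p = pos_model[i][(prev, c)]
        q = neg_model[i][(prev, c)]
        ratios.append(math.log(p / q))
    return ratios


def score(seq, pos_model, neg_model, min_abs_ratio=0.0):
    """Sum the log-ratios of the selected positions only.

    Hypothetical selection rule: keep a position when the absolute
    value of its log-ratio is at least `min_abs_ratio`, so the set of
    positions used can differ from one candidate to the next.
    """
    ratios = position_log_ratios(seq, pos_model, neg_model)
    return sum(r for r in ratios if abs(r) >= min_abs_ratio)


# Invented toy data, for illustration only.
true_sites = ["AGGTAA", "AGGTGA", "CGGTAA", "AGGTAG"]
decoys = ["ACCTAA", "TGCATA", "CATGCA", "GTACGT"]
pos_model = train_first_order(true_sites)
neg_model = train_first_order(decoys)
candidate_score = score("AGGTAA", pos_model, neg_model, min_abs_ratio=0.1)
```

A candidate would be called a splice site when its score exceeds a cutoff chosen on held-out data. With `min_abs_ratio=0.0` the sketch reduces to an ordinary first-order Markov log-odds score, which is one way to read the abstract's comparison with the basic Markov model.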