PrimerDB: a synthetic database for primer/oligonucleotide hybridization and efficiency prediction

Authors:
Randeep Singh;Sunil Kumar;Lalit Gupta
Affiliations:
Philips Research Asia - Bangalore (PRA-B);Philips Research Asia - Bangalore (PRA-B);Philips Research Asia - Bangalore (PRA-B)
Venue:
BioMED '08 Proceedings of the Sixth IASTED International Conference on Biomedical Engineering
Year:
2008

Abstract

Primer/oligonucleotide design is the most important step for any array- or PCR-based assay. Several parameters affect the outcome of the whole assay. One of the most important parameters affecting the hybridization of the probe with the target sequence is its free energy. In the present work, we have designed primers for genes associated with 30 monogenic disorders using Primer3. The database was created with free energy parameters, and the rating of the sequences was obtained through a modified NetPrimer in which the secondary structure parameter prediction was made more robust. In addition to these parameters, we have calculated various other sequence features, including the fraction of A, T, G and C and the position of dimers and trimers. The PrimerDB database will be made public, and it can be used for mutation detection, cloning and sequencing for these 30 monogenic disorders. We obtained around one million primers/oligonucleotides for the given 30 genes. These sequences were then grouped into two categories: one with a rating of more than 90 and the other with a rating of less than 90. After extraction of various features, these sequences were subjected to feature selection algorithms to retain the most important parameters (features) and remove the redundant ones. The most important features were then used for classification with Support Vector Machines (SVM), where the overall classification efficiency obtained was 70%. The results presented in this paper are preliminary, and further investigations are being carried out on the sequences to extract more features and to increase the classification accuracy.
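
The sequence features named in the abstract (base fractions, dimer and trimer positions) and the two rating categories can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, positions are assumed to be 0-based start indices, and because the abstract does not say how a rating of exactly 90 is handled, the sketch assumes it falls in the lower category.

```python
from collections import Counter


def base_fractions(seq):
    """Return the fraction of A, T, G and C in a sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    n = len(seq)
    return {base: (counts[base] / n if n else 0.0) for base in "ATGC"}


def kmer_positions(seq, k):
    """Map each k-mer (k=2 for dimers, k=3 for trimers) to its 0-based start positions."""
    seq = seq.upper()
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    return positions


def rating_class(rating, threshold=90):
    """Assumed labeling: 1 if the rating is above the threshold, otherwise 0."""
    return 1 if rating > threshold else 0


def extract_features(seq):
    """Collect the illustrative features for one primer sequence."""
    features = base_fractions(seq)
    features["dimers"] = kmer_positions(seq, 2)
    features["trimers"] = kmer_positions(seq, 3)
    return features
```

Feature dictionaries of this kind would still need to be turned into fixed-length numeric vectors before feature selection and SVM classification; how that was done is not described in the abstract.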