Statistical Pattern Recognition: A Review
IEEE Transactions on Pattern Analysis and Machine Intelligence
Advances in high-throughput technology in molecular biology have produced large volumes of sequence data for a wide range of organisms. Some organisms, such as viruses, show considerable variation in their nucleotide sequences and can be categorized into several subtypes. A sequence pattern that characterizes one subtype and discriminates it from the other subtypes is called a signature. This paper proposes a method for extracting signatures from a collection of sequence data. Based on the position-specific difference in relative base frequency between the data set for one subtype and the data set for the other subtypes, the proposed method evaluates the discrimination capability of potential signatures. A tool implementing the proposed method has been developed and applied in an experiment to extract signatures for HIV-1 subtypes.
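The abstract does not give the method's details, so the following is only a minimal sketch of the general idea of a position-specific relative base frequency difference, not the authors' implementation. It assumes the sequences are already aligned to equal length and contain only the bases A, C, G and T. The function names, the toy sequences, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter

BASES = "ACGT"  # assumption: aligned nucleotide sequences without gaps


def base_frequencies(seqs):
    """Relative frequency of each base at each position of aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    freqs = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        freqs.append({b: counts[b] / len(seqs) for b in BASES})
    return freqs


def frequency_differences(target, others):
    """Per-position, per-base frequency in `target` minus frequency in `others`."""
    ft = base_frequencies(target)
    fo = base_frequencies(others)
    if len(ft) != len(fo):
        raise ValueError("both data sets must share the same alignment length")
    return [{b: ft[i][b] - fo[i][b] for b in BASES} for i in range(len(ft))]


def candidate_positions(target, others, threshold=0.8):
    """Positions where one base is far more frequent in `target` than in `others`.

    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    result = []
    for pos, diff in enumerate(frequency_differences(target, others)):
        base = max(BASES, key=lambda b: diff[b])
        if diff[base] >= threshold:
            result.append((pos, base, diff[base]))
    return result


# Toy data standing in for two subtype data sets (not real HIV-1 sequences).
subtype_a = ["ACGT", "ACGA", "ACGT"]
subtype_b = ["ATGT", "ATCT", "ATGA"]
print(candidate_positions(subtype_a, subtype_b))
```

In this toy example only position 1 (zero-based) separates the two sets cleanly: every sequence in the first set has C there and none in the second set does. A real signature would presumably combine several such positions and would need handling for gaps and ambiguous bases, which this sketch leaves out.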