Self-Organizing Maps of Position Weight Matrices for Motif Discovery in Biological Sequences
Artificial Intelligence Review
Motif discoveries in unaligned molecular sequences using self-organizing neural networks
IEEE Transactions on Neural Networks
In this paper, we examine the problem of identifying motifs in DNA sequences. Transcription factor binding sites, which are functionally significant subsequences, are treated as motifs. To reveal such DNA motifs, our method applies fuzzy clustering to position weight matrices (PWMs). The Fuzzy C-Means (FCM) algorithm predicted the known motifs present in the intergenic regions of the GAL4, CBF1 and GCN4 DNA sequences. The paper also compares FCM with other clustering methods such as Self-Organizing Map (SOM) and K-Means, and compares the FCM results with those of the popular motif discovery tool Multiple Expectation Maximization for Motif Elicitation (MEME). We conclude that soft-clustering-based machine learning methods such as FCM are useful for finding patterns in biological sequences.
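To make the idea of fuzzy clustering over position weight matrices concrete, the following is a minimal sketch and not the pipeline used in the paper. It assumes that candidate subsequences (k-mers) are one-hot encoded as flattened 4 x k matrices and clustered with a textbook Fuzzy C-Means update; under that assumption each cluster center, reshaped to 4 x k, has columns that sum to 1 and can be read as a PWM. The helper names (`one_hot`, `fuzzy_c_means`), the fuzzifier `m = 2`, and the toy k-mers are illustrative assumptions, not data or settings from the paper.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(kmer):
    """Encode a k-mer as a flattened 4 x k indicator matrix (rows: A, C, G, T)."""
    mat = np.zeros((4, len(kmer)))
    for j, base in enumerate(kmer):
        mat[ALPHABET.index(base), j] = 1.0
    return mat.ravel()

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook FCM: returns (centers, U), where U[i, j] is the membership
    of sample i in cluster j and every row of U sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # random fuzzy partition
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                 # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Recompute centers so they match the final memberships
    W = U ** m
    centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers, U

# Toy k-mers invented for illustration (two loosely conserved patterns).
kmers = ["CGGAGT", "CGGACT", "CGGAGA", "TGACTC", "TGACTA", "TGAGTC"]
k = len(kmers[0])
X = np.array([one_hot(s) for s in kmers])

centers, U = fuzzy_c_means(X, c=2)
pwms = centers.reshape(2, 4, k)                  # one 4 x k PWM per cluster
labels = U.argmax(axis=1)                        # hard assignment, for display only

for kmer, row in zip(kmers, U):
    print(kmer, np.round(row, 3))
```

Unlike K-Means, each k-mer keeps a graded membership in every cluster, so a subsequence that partly matches two motifs contributes to both PWMs instead of being forced into one.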