Assessment of clustering algorithms for unsupervised transcription factor binding site discovery

Authors:
Mustafa Karabulut;Turgay Ibrikci
Affiliations:
Vocational High School, Gaziantep University, 27310 Gaziantep, Turkey;Department of Electrical-Electronics Engineering, Çukurova University, 01330 Adana, Turkey
Venue:
Expert Systems with Applications: An International Journal
Year:
2011

Abstract

Identifying transcription factor binding sites is a key task in understanding gene regulation mechanisms and in discovering gene networks and functions. Clustering approaches have proved useful for finding such patterns in the promoter regions of co-regulated genes. This paper studies four clustering algorithms, Self-Organizing Map, K-Means, Fuzzy C-Means and Expectation-Maximization, for discovering motifs in datasets extracted from Saccharomyces cerevisiae, Escherichia coli, Drosophila melanogaster and Homo sapiens DNA sequences. The modifications required to adapt the clustering algorithms to the motif-finding task are presented throughout the paper. Their motif-finding performance is then discussed and evaluated against a popular motif-finding method, MEME.
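
To make the general idea of clustering-based motif discovery concrete, the following is a minimal sketch in plain Python. It is not the authors' method: the abstract does not describe their modifications, so the sliding-window extraction, the one-hot encoding, the deterministic seeding, the toy sequences and all function names (`windows`, `one_hot`, `kmeans`, `consensus`) are assumptions made for illustration. It uses generic Lloyd-style K-Means only; Self-Organizing Map, Fuzzy C-Means, Expectation-Maximization and the comparison against MEME are not covered.

```python
from collections import Counter

ALPHABET = "ACGT"


def windows(seqs, w):
    """Return every length-w subsequence (sliding window) of every sequence."""
    return [s[i:i + w] for s in seqs for i in range(len(s) - w + 1)]


def one_hot(kmer):
    """Encode a k-mer as a flat vector of 4 * len(kmer) values in {0.0, 1.0}."""
    return [1.0 if base == a else 0.0 for base in kmer for a in ALPHABET]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def kmeans(points, k, iters=20):
    """Generic Lloyd-style K-Means.

    Assumption for reproducibility: centroids are seeded with the first k
    distinct points rather than at random.
    """
    centroids = []
    for p in points:
        if p not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def consensus(kmers):
    """Most frequent base in each column of a set of aligned k-mers."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*kmers))


# Toy input (hypothetical sequences, not data from the paper).
seqs = ["TTGACGTCAAT", "CCTGACGTCAG", "GATGACGTCAC"]
kmers = windows(seqs, 8)
labels, _ = kmeans([one_hot(m) for m in kmers], k=3)

# Group the windows by cluster and summarise each cluster by its consensus.
clusters = {}
for kmer, lab in zip(kmers, labels):
    clusters.setdefault(lab, []).append(kmer)
summaries = {lab: consensus(members) for lab, members in clusters.items()}
```

In this sketch each cluster of windows is treated as a candidate motif and summarised by a column-wise consensus string. A practical motif finder would also need to score clusters against a background model and handle overlapping windows, neither of which is attempted here.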