Geometric visualization of TF binding sites in context

  • Authors: Chih Lee; Chun-Hsi Huang
  • Affiliations: University of Connecticut, Storrs, CT; University of Connecticut, Storrs, CT
  • Venue: Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
  • Year: 2011

Abstract

The sequence logo is a widely used tool for visualizing a sequence motif, the pattern shared by a set of short sequences. It gives a clear picture of the motif as a whole, displaying the base composition and information content at each position. It does not, however, allow us to visualize a set of "positive" short sequences in the context of a set of "negative" ones. In this work, we address this issue with a geometric approach, using the binding sites and non-binding sites of a transcription factor (TF) as the positive and negative sets. In particular, we propose to use Fisher's discriminant analysis to identify axes for projecting short sequences onto a 2-dimensional Euclidean space. We show that, in addition to visualization, the proposed approach enables the discovery of better measures of similarity between two short sequences in the TF binding site search problem. Moreover, coupled with a clustering algorithm, this novel technique can be used for motif subtype identification as well as visualization. Finally, we argue that our approach can be used side by side with the sequence logo for a wide variety of purposes.
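
The projection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-hot encoding, the ridge term added to the within-class scatter, the helper names, and the toy sequences are all assumptions made for illustration. A two-class Fisher analysis yields a single discriminant axis, and the abstract does not say how the two plotting axes are chosen, so the sketch stops at the first axis.

```python
# Minimal sketch of a two-class Fisher discriminant axis for short DNA sequences.
# Assumptions (not from the paper): one-hot encoding, ridge regularization,
# and the made-up toy sequences below.
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a flat 0/1 vector of length 4 * len(seq)."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]

def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def scatter(vectors, mu):
    """Sum of outer products of the vectors after centering on mu."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for v in vectors:
        c = [x - m for x, m in zip(v, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += c[i] * c[j]
    return s

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fisher_axis(pos, neg, ridge=1e-3):
    """Unit vector w proportional to (S_w + ridge * I)^-1 (mu_pos - mu_neg)."""
    mu_p, mu_n = mean(pos), mean(neg)
    s_p, s_n = scatter(pos, mu_p), scatter(neg, mu_n)
    d = len(mu_p)
    # One-hot columns are linearly dependent, so S_w is singular without the ridge.
    s_w = [[s_p[i][j] + s_n[i][j] + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    diff = [p - n for p, n in zip(mu_p, mu_n)]
    w = solve(s_w, diff)
    norm = sum(x * x for x in w) ** 0.5
    return [x / norm for x in w]

def project(v, w):
    """Coordinate of vector v along axis w."""
    return sum(a * b for a, b in zip(v, w))

# Hypothetical "binding" and "non-binding" sites, invented for this example.
positives = [one_hot(s) for s in ["TATAAT", "TATAAA", "TATGAT", "TACAAT"]]
negatives = [one_hot(s) for s in ["GCCGGC", "CGGCCA", "AGCCGT", "GGCACC"]]

w = fisher_axis(positives, negatives)
pos_scores = [project(v, w) for v in positives]
neg_scores = [project(v, w) for v in negatives]
print("positive projections:", pos_scores)
print("negative projections:", neg_scores)
```

Plotting each sequence at its coordinate along `w`, together with a second axis of the reader's choosing, gives a picture of the positive set in the context of the negative set. The difference in projected coordinates is one candidate similarity measure between two short sequences.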