Tests of automatic annotation using KOG proteins and ESTs from 4 eukaryotic organisms

  • Authors:
  • Maurício de Alvarenga Mudado;Estevam Bravo-Neto;José Miguel Ortega

  • Affiliations:
  • Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil;Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil;Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil

  • Venue:
  • BSB'05 Proceedings of the 2005 Brazilian conference on Advances in Bioinformatics and Computational Biology
  • Year:
  • 2005

Abstract

BLAST homology searches have been widely used to assign function to novel sequences. Secondary databases such as KOG can be used for this purpose, since their sequences carry a functional classification. We devised an experiment in which public ESTs from four eukaryotic organisms whose protein sequences are present in the KOG database are classified into functional KOG categories using tBLASTn. First we assigned the ESTs from one organism to KTL (KOG, TWOG and LSE) proteins, and then we searched the database depleted of that organism's proteins to simulate a novel transcriptome. The data show that classification was correct (assignment equals annotation) in 87.2%, 96.8%, 92.0% and 88.7% of cases for A. thaliana (Ath), C. elegans (Cel), D. melanogaster (Dme) and H. sapiens (Hsa), respectively. We estimated identity cutoffs for all organisms for use with tBLASTn. These cutoffs trim the same number of events as BLASTn does, in order to minimize false positives caused by sequencing errors. We found amino-acid identity cutoffs of 80%, 78%, 78% and 84% for Hsa, Dme, Cel and Ath, respectively. We then evaluated our system by comparing the KTL categories of the assigned ESTs with the KTL categories into which the same ESTs were classified when the organism's own KTL proteins were absent. Moreover, we show the annotation potential of the KOG database and of the ESTs used. Supplementary Information can be found at: http://www.biodados.icb.ufmg.br
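
The evaluation described in the abstract (assignment against the full database, annotation against the depleted database, then a comparison of categories) can be illustrated with a minimal Python sketch. This is not the authors' code: the `Hit` record, the `agreement_rate` function and the toy data are hypothetical, and applying the identity cutoff to the assignment step is an assumption made for illustration.

```python
from typing import Dict, NamedTuple, Optional


class Hit(NamedTuple):
    """Best tBLASTn hit for one EST (hypothetical record layout)."""
    protein_id: str   # KTL protein that was hit
    category: str     # functional category of that protein
    identity: float   # percent amino-acid identity of the alignment


def agreement_rate(
    assigned: Dict[str, Hit],
    annotated: Dict[str, Hit],
    min_identity: float = 0.0,
) -> Optional[float]:
    """Fraction of ESTs whose annotation category equals the assignment category.

    assigned:  best hits against the full database (organism's own proteins present).
    annotated: best hits against the database depleted of that organism's proteins.
    min_identity: assumed cutoff applied to the assignment step only.
    Returns None when no EST can be compared.
    """
    compared = 0
    correct = 0
    for est_id, reference in assigned.items():
        if reference.identity < min_identity:
            continue  # assignment below the cutoff: EST is not used
        hit = annotated.get(est_id)
        if hit is None:
            continue  # no hit in the depleted database: nothing to compare
        compared += 1
        if hit.category == reference.category:
            correct += 1
    return correct / compared if compared else None


# Toy data, invented for illustration only.
assigned = {
    "est1": Hit("own_p1", "J", 95.0),
    "est2": Hit("own_p2", "K", 90.0),
    "est3": Hit("own_p3", "L", 60.0),
}
annotated = {
    "est1": Hit("other_q1", "J", 70.0),
    "est2": Hit("other_q2", "T", 65.0),
    "est3": Hit("other_q3", "L", 50.0),
}

rate_with_cutoff = agreement_rate(assigned, annotated, min_identity=80.0)
rate_without_cutoff = agreement_rate(assigned, annotated)
```

With an 80% cutoff, `est3` is dropped at the assignment step, so only `est1` and `est2` are compared. ESTs with no hit in the depleted database are skipped rather than counted as errors, which is one of several possible conventions.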