Motivation: Knowledge base construction has been an area of intense activity and great importance in the growth of computational biology. However, there is little or no history of work on the subject of evaluation of knowledge bases, either with respect to their contents or with respect to the processes by which they are constructed. This article proposes the application of a metric from software engineering known as the found/fixed graph to the problem of evaluating the processes by which genomic knowledge bases are built, as well as the completeness of their contents.

Results: Well-understood patterns of change in the found/fixed graph are found to occur in two large publicly available knowledge bases. These patterns suggest that the current manual curation processes will take far too long to complete the annotations of even just the most important model organisms, and that at their current rate of production, they will never be sufficient for completing the annotation of all currently available proteomes.

Contact: larry.hunter@uchsc.edu
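As an illustration of the found/fixed graph mentioned in the abstract, the following is a minimal sketch rather than the authors' method. It assumes the common software-engineering reading of the metric: the cumulative number of items found (here, entries still needing annotation) is plotted against the cumulative number fixed (entries annotated), and the gap between the two curves is the open backlog. The function names, the per-period counts, and the linear extrapolation are all hypothetical and are not taken from the article.

```python
from itertools import accumulate


def found_fixed_series(found_per_period, fixed_per_period):
    """Return cumulative found, cumulative fixed, and the open gap per period."""
    if len(found_per_period) != len(fixed_per_period):
        raise ValueError("both series must cover the same number of periods")
    cum_found = list(accumulate(found_per_period))
    cum_fixed = list(accumulate(fixed_per_period))
    # The gap is the backlog: items found so far that are not yet fixed.
    gap = [f - x for f, x in zip(cum_found, cum_fixed)]
    return cum_found, cum_fixed, gap


def periods_to_close(gap_now, found_rate, fixed_rate):
    """Naive linear extrapolation (an assumption of this sketch).

    Returns the number of periods until the gap reaches zero, or None if
    the fix rate does not exceed the find rate, so the gap never closes.
    """
    if gap_now <= 0:
        return 0.0
    closing_rate = fixed_rate - found_rate
    if closing_rate <= 0:
        return None
    return gap_now / closing_rate


# Hypothetical per-period counts, invented purely for illustration.
found = [120, 90, 60, 40]   # new entries needing annotation in each period
fixed = [20, 30, 35, 45]    # entries annotated in each period

cum_found, cum_fixed, gap = found_fixed_series(found, fixed)
estimate = periods_to_close(gap[-1], found[-1], fixed[-1])
print(cum_found, cum_fixed, gap, estimate)
```

In this reading, a gap that keeps widening, or narrows only slowly, is the kind of pattern that would indicate a curation process unlikely to finish in a reasonable time. A `None` result corresponds to a backlog that never closes at current rates.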