Using IBM Content Manager for genomic data annotation and quality assurance tasks

  • Authors:
  • H. Huang; J. Lu; W. B. Hunter; S. Liang

  • Affiliations:
  • School of Information, University of South Florida, Tampa, FL; Center for Viticulture and Small Fruit Research, Florida A&M University, Tallahassee, FL; U.S. Department of Agriculture, Agriculture Research Service, U.S. Horticultural Research Laboratory, Fort Pierce, FL; College of Medicine, Southern Medical University, Guangzhou, China

  • Venue:
  • IBM Journal of Research and Development
  • Year:
  • 2011

Abstract

As the amount of heterogeneous genomic data and related annotations continues to grow, a flexible and easy-to-access data management solution is required to integrate such data and diverse annotation tasks. This preliminary report describes the benefits of using IBM DB2® Content Manager software for task-oriented grape genome annotation, with data quality-assurance checks performed throughout the annotation process. To demonstrate the usability of this application, we describe the implementation of two real-life content-based genome annotation scenarios: 1) annotation of expressed sequence tags; and 2) annotation of sequences related to simple sequence repeat markers. IBM DB2 Content Manager lets users construct content-based genomic information applications from customized content documents with attributes; these documents can be built rapidly and adapted readily within an easy-to-use interface. Users can check annotation quality while making annotations by using a built-in, standardized data quality-assurance procedure referred to as annotation "routing." The system provides search features and cross-links across different annotation contents or data formats. The data quality workflow and procedures within the system also contributed to accuracy and consistency across the data annotation and curation lifecycle.