Ontology concepts and tools for statistical genomics

  • Authors: Vincent J. Carey
  • Affiliations: Department of Medicine, Channing Laboratory, Harvard Medical School, 181 Longwood Ave., Boston, MA
  • Venue: Journal of Multivariate Analysis
  • Year: 2004

Abstract

In computer science, an ontology is any formally structured vocabulary covering a conceptual domain. Gene Ontology (GO) is a structured collection of terms defining biological processes, cellular components, or molecular functions for the purpose of characterizing gene products and functions. The structure of GO is a directed acyclic graph (DAG) with typed edges. We describe a simple formalism for working with ontologies for statistical purposes, and define object-ontology complexes, which encode the usage of the vocabulary to label objects under analysis. Recently developed concepts of information content and semantic similarity are evaluated and used to explore the association between LocusLink loci and GO. We investigate relations between GO DAG structure, association evidence codes and term information content, illustrate computation of semantic similarities of genes within and between clusters discovered in a microarray, and describe a more general ontology and its use in inference on genetic network structure.
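To make the notions of information content and semantic similarity concrete, the following is a minimal sketch on a toy DAG. It assumes one common, Resnik-style definition: the information content of a term is the negative log of the fraction of objects annotated to it or to any of its descendants, and the similarity of two terms is the largest information content among their shared ancestors. These definitions may differ in detail from those used in the paper. The term names, gene names, and annotation table are hypothetical, and edge types are ignored for simplicity.

```python
import math

# Hypothetical toy ontology (not real GO): term -> set of parent terms.
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A"},
    "D": {"A", "B"},
}

# Hypothetical object-ontology complex: gene -> directly annotated terms.
ANNOTATIONS = {
    "g1": {"C"},
    "g2": {"D"},
    "g3": {"B"},
    "g4": {"C", "D"},
}


def ancestors(term):
    """Return the term together with all of its ancestors in the DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_probabilities():
    """Fraction of genes annotated to each term or to any descendant."""
    counts = {t: 0 for t in PARENTS}
    for terms in ANNOTATIONS.values():
        covered = set().union(*(ancestors(t) for t in terms))
        for t in covered:
            counts[t] += 1
    n = len(ANNOTATIONS)
    return {t: c / n for t, c in counts.items()}


def information_content(probs):
    """Information content -log p(t) for every term with p(t) > 0."""
    return {t: -math.log(p) for t, p in probs.items() if p > 0}


def term_similarity(t1, t2, ic):
    """Largest information content among the shared ancestors."""
    common = ancestors(t1) & ancestors(t2)
    return max((ic[t] for t in common if t in ic), default=0.0)


def gene_similarity(g1, g2, ic):
    """Best term-to-term similarity over the two genes' annotations."""
    return max(
        term_similarity(a, b, ic)
        for a in ANNOTATIONS[g1]
        for b in ANNOTATIONS[g2]
    )


IC = information_content(term_probabilities())
```

In this toy example the root term covers every gene and so carries no information, while the more specific terms score higher. Taking the maximum over annotation pairs is only one way to lift term similarity to genes; averaging over pairs is an equally plausible choice.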