The Gene Ontology Categorizer

Authors:
Cliff A. Joslyn;Susan M. Mniszewski;Andy Fulmer;Gary Heaton
Affiliations:
Computer and Computational Sciences, Mail Stop B265, Los Alamos National Laboratory, Los Alamos, NM 87545, USA;Computer and Computational Sciences, Mail Stop B265, Los Alamos National Laboratory, Los Alamos, NM 87545, USA;Corporate Biotechnology, Miami Valley Labs;Corporate Functions-IT, Procter & Gamble, Cincinnati, OH 45239-8707, USA
Venue:
Bioinformatics
Year:
2004

Citing 0
Cited 9

CLUGO: A Clustering Algorithm for Automated Functional Annotations Based on Gene Ontology
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining

Clustering pair-wise dissimilarity data into partially ordered sets
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining

Combining gene sequence similarity and textual information for gene function annotation in the literature
Information Retrieval

Discovering relations among GO-annotated clusters by Graph Kernel methods
ISBRA'07 Proceedings of the 3rd international conference on Bioinformatics research and applications

Automated methods of predicting the function of biological sequences using GO and rough set
PRIB'07 Proceedings of the 2nd IAPR international conference on Pattern recognition in bioinformatics

Concept lattice representations of annotated taxonomies
CLA'06 Proceedings of the 4th international conference on Concept lattices and their applications

Weighted pseudo-distances for categorization in semantic hierarchies
ICCS'05 Proceedings of the 13th international conference on Conceptual Structures: common Semantics for Sharing Knowledge

Spectral clustering gene ontology terms to group genes by function
WABI'05 Proceedings of the 5th International conference on Algorithms in Bioinformatics

Order metrics for semantic knowledge systems
HAIS'10 Proceedings of the 5th international conference on Hybrid Artificial Intelligence Systems - Volume Part II

Quantified Score

Hi-index	3.84

Abstract

Summary: The Gene Ontology Categorizer, developed jointly by Los Alamos National Laboratory and Procter & Gamble Corp., addresses the categorization task in the Gene Ontology (GO): given a list of genes of interest, which nodes of the GO best summarize or categorize that list? The motivating question comes from drug discovery, where, after a gene expression analysis experiment, we wish to understand the overall effect of a cell treatment or condition by identifying 'where' in the GO the differentially expressed genes fall. Are they 'clustered' together in one place, or in two places, or spread uniformly throughout the GO? Do they sit 'high' or 'low' in the hierarchy? To address this need, we view bio-ontologies more as combinatorially structured databases than as facilities for logical inference, and draw on the discrete mathematics of finite partially ordered sets (posets) to develop data representations and algorithms appropriate for the GO. In doing so, we have laid the foundations for a general set of methods that address not just the categorization task but also other tasks (e.g. distances in ontologies, and ontology merger and exchange) in both the GO and other bio-ontologies (such as the Enzyme Commission database or the Medical Subject Headings), cast as hierarchically structured taxonomic knowledge systems.
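
To make the categorization task concrete, the following is a minimal sketch in Python, not the authors' algorithm. It assumes a toy six-term ontology and six invented genes (`PARENTS`, `ANNOTATIONS`), and it uses a simple illustrative score (coverage of the query list times specificity of the node) in place of whatever scoring the Gene Ontology Categorizer actually applies. The only idea taken from the abstract is that the ontology is treated as a finite poset, so a gene annotated to a term also counts toward every ancestor of that term.

```python
from collections import defaultdict

# Hypothetical toy ontology: each term maps to its parents. A term may have
# several parents, so the structure is a poset (DAG) rather than a tree.
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "lipid_metabolism": ["metabolism"],
    "kinase_signaling": ["signaling"],
    "lipid_kinase": ["lipid_metabolism", "kinase_signaling"],
}

# Hypothetical annotations: gene -> terms it is directly annotated to.
ANNOTATIONS = {
    "g1": ["lipid_metabolism"],
    "g2": ["lipid_kinase"],
    "g3": ["lipid_kinase"],
    "g4": ["kinase_signaling"],
    "g5": ["signaling"],
    "g6": ["metabolism"],
}


def up_set(term, parents=PARENTS):
    """Return the term together with all of its ancestors in the poset."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def genes_under(annotations=ANNOTATIONS):
    """Map each term to the set of genes annotated at or below it."""
    below = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            for ancestor in up_set(term):
                below[ancestor].add(gene)
    return below


def categorize(query, annotations=ANNOTATIONS):
    """Rank terms for a gene list using an illustrative score (an assumption):
    (fraction of the query under the term) * (fraction of the term's genes
    that are in the query)."""
    below = genes_under(annotations)
    query = set(query)
    ranked = []
    for term, genes in below.items():
        hits = len(genes & query)
        if hits:
            score = (hits / len(query)) * (hits / len(genes))
            ranked.append((term, round(score, 3)))
    # Highest score first; ties broken alphabetically by term name.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for term, score in categorize(["g1", "g2", "g3"]):
        print(f"{term:18s} {score}")
```

With the query `["g1", "g2", "g3"]`, the top-ranked term under these assumptions is `lipid_metabolism`: it covers all three genes and nothing else, whereas `root` also covers them but is too general, and `lipid_kinase` is specific but misses `g1`. This is the 'high' versus 'low' trade-off the abstract alludes to.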