Integration and Mining of Genomic Annotations: Experiences and Perspectives in GFINDer Data Warehousing

  • Authors:
  • Marco Masseroli; Stefano Ceri; Alessandro Campi

  • Affiliations:
  • Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milano, Italy 20133; Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milano, Italy 20133; Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milano, Italy 20133

  • Venue:
  • DILS '09 Proceedings of the 6th International Workshop on Data Integration in the Life Sciences
  • Year:
  • 2009

Abstract

Many tasks in bioinformatics require the comprehensive evaluation of different types of data, generally available in distributed and heterogeneous data sources. Several approaches, including federated databases, multidatabases, and mediator-based systems, have been proposed to integrate data from multiple sources. Yet data warehousing seems to be the most adequate approach when large amounts of data need to be integrated, efficiently processed, and mined comprehensively. To support the biological interpretation of high-throughput gene lists, we previously developed GFINDer (Genome Functional INtegrated Discoverer, http://www.bioinformatics.polimi.it/GFINDer/), a web server that statistically analyzes and mines functional and phenotypic gene annotations scattered across numerous databanks to highlight annotation categories that are significantly enriched or depleted in the considered gene lists. GFINDer includes a data warehouse that integrates gene and protein annotations of several organisms, expressed through various controlled terminologies and ontologies. Here, we describe the GFINDer data warehouse and discuss the lessons learned during its construction and five years of maintenance and development.