Mining structural databases: an evolutionary multi-objective conceptual clustering methodology

  • Authors:
  • R. Romero-Zaliz;C. Rubio-Escudero;O. Cordón;O. Harari;C. del Val;I. Zwir

  • Affiliations:
  • Dept. Computer Science and Artificial Intelligence, University of Granada, Spain;Dept. Computer Science and Artificial Intelligence, University of Granada, Spain;Dept. Computer Science and Artificial Intelligence, University of Granada, Spain;Dept. Computer Science and Artificial Intelligence, University of Granada, Spain;Dept. Computer Science and Artificial Intelligence, University of Granada, Spain;Dept. Computer Science and Artificial Intelligence, University of Granada, Spain

  • Venue:
  • EuroGP'06 Proceedings of the 2006 international conference on Applications of Evolutionary Computing
  • Year:
  • 2006

Abstract

The increased availability of biological databases containing representations of complex objects permits access to vast amounts of data. In spite of the recent renewed interest in knowledge-discovery techniques (or data mining), there is a dearth of data analysis methods intended to facilitate understanding of the represented objects and related systems by their most representative features and the relationships derived from these features (i.e., structural data). In this paper we propose a conceptual clustering methodology termed EMO-CC, for Evolutionary Multi-Objective Conceptual Clustering, that uses multi-objective and multi-modal optimization techniques based on Evolutionary Algorithms to uncover representative substructures from structural databases. In addition, EMO-CC provides annotations of the uncovered substructures and, based on them, applies an unsupervised classification approach to retrieve new members of previously discovered substructures. We apply EMO-CC to the Gene Ontology database to recover interesting substructures that describe problems from different points of view, and use them to explain immuno-inflammatory responses measured in terms of gene expression profiles derived from the analysis of longitudinal blood expression profiles of human volunteers treated with intravenous endotoxin compared to placebo.