Identification of breast cancer subtypes using multiple gene expression microarray datasets

  • Authors:
  • Alexandre Mendes

  • Affiliations:
  • Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine, School of Electrical Engineering and Computer Science, Faculty of Engineering and Built Environment, The University of ...

  • Venue:
  • AI'11 Proceedings of the 24th international conference on Advances in Artificial Intelligence
  • Year:
  • 2011

Abstract

This work is motivated by the need for consensus clustering methods that use multiple datasets and are applicable to microarray data. It introduces a new unsupervised method for clustering samples with similar genetic profiles using information from two or more datasets. The method was tested on two breast cancer gene expression microarray datasets, with 295 and 249 samples and 12,325 genes in common. Four subtypes with similar genetic profiles were identified in both datasets. Analysis of the clinical information for these subtypes confirmed different levels of tumour aggressiveness, measured by the time to metastasis, indicating a connection between genetic profile and prognosis. Finally, the identified subtypes were compared with established subtypes of breast cancer. Taken together, the results indicate that the new approach detected similar gene expression profile patterns across the two datasets without any a priori knowledge. The two datasets used in this work, as well as all the figures, are available for download from http://www.cs.newcastle.edu.au/~mendes/BreastCancer.html.