Bi-level clustering of mixed categorical and numerical biomedical data

  • Authors: Bill Andreopoulos; Aijun An; Xiaogang Wang
  • Affiliations: Department of Computer Science and Engineering, York University, Toronto, Ontario M3J 1P3, Canada (Andreopoulos, An); Department of Mathematics and Statistics, York University, Toronto, Ontario M3J 1P3, Canada (Wang)
  • Venue: International Journal of Data Mining and Bioinformatics
  • Year: 2006

Abstract

Biomedical data sets often mix categorical and numerical attributes, where the former carry semantic information about the objects and the latter record experimental results. We present BILCOM, an algorithm for 'Bi-Level Clustering of Mixed categorical and numerical data types'. BILCOM performs a pseudo-Bayesian process in which categorical clustering serves as the prior. On biomedical data sets of mixed types, such as hepatitis, thyroid disease and yeast gene expression data with Gene Ontology annotations, BILCOM produces more accurate partitions than clustering on either data type alone.
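
The abstract gives only the outline of the method, so the following Python sketch is not the BILCOM algorithm itself. It is a toy illustration of the general idea of a two-level scheme in which a categorical clustering acts as a prior and the numerical attributes act as evidence. Everything specific in it is an assumption made for illustration: the function names (`categorical_prior`, `centroids`, `bilevel_cluster`), the simple-matching similarity with additive smoothing, the Gaussian-style kernel on squared Euclidean distance, and the tiny data set.

```python
import math

def matches(a, b):
    """Count attribute positions where two categorical tuples agree."""
    return sum(x == y for x, y in zip(a, b))

def categorical_prior(cat_rows, modes, smoothing=1.0):
    """Level 1 (assumed form): prior weight of each cluster for each object,
    proportional to simple matching against the cluster's mode plus smoothing."""
    priors = []
    for row in cat_rows:
        scores = [matches(row, m) + smoothing for m in modes]
        total = sum(scores)
        priors.append([s / total for s in scores])
    return priors

def centroids(num_rows, labels, k):
    """Mean numerical vector of each cluster under the given hard labels."""
    dims = len(num_rows[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for row, c in zip(num_rows, labels):
        counts[c] += 1
        for d, v in enumerate(row):
            sums[c][d] += v
    return [[s / max(n, 1) for s in vec] for vec, n in zip(sums, counts)]

def bilevel_cluster(cat_rows, num_rows, modes):
    """Return (level-1 labels, level-2 labels) for the toy two-level scheme."""
    k = len(modes)
    priors = categorical_prior(cat_rows, modes)
    # Level 1: hard assignment from the categorical prior (ties go to the lower index).
    level1 = [max(range(k), key=lambda c: p[c]) for p in priors]
    cents = centroids(num_rows, level1, k)
    # Level 2: weight the prior by a numerical likelihood (assumed kernel) and reassign.
    level2 = []
    for p, row in zip(priors, num_rows):
        post = []
        for c in range(k):
            d2 = sum((v - m) ** 2 for v, m in zip(row, cents[c]))
            post.append(p[c] * math.exp(-d2))
        level2.append(max(range(k), key=lambda c: post[c]))
    return level1, level2

# Hypothetical data: two categorical attributes and one numerical measurement per object.
cat_rows = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y"), ("a", "y")]
num_rows = [(0.0,), (0.2,), (5.0,), (5.2,), (4.9,)]
modes = [("a", "x"), ("b", "y")]  # assumed seed modes for the two clusters

level1, level2 = bilevel_cluster(cat_rows, num_rows, modes)
print(level1, level2)
```

In this toy data the fifth object matches both modes equally well on its categorical attributes, so the first level places it by tie-breaking alone; the second level moves it to the cluster whose numerical centroid it sits next to. That is the kind of correction a combined treatment of both data types can make, which is the motivation stated in the abstract.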