Bi-level clustering of mixed categorical and numerical biomedical data

Authors:
Bill Andreopoulos; Aijun An; Xiaogang Wang
Affiliations:
Department of Computer Science and Engineering, York University, Toronto, Ontario M3J 1P3, Canada; Department of Computer Science and Engineering, York University, Toronto, Ontario M3J 1P3, Canada; Department of Mathematics and Statistics, York University, Toronto, Ontario M3J 1P3, Canada
Venue:
International Journal of Data Mining and Bioinformatics
Year:
2006

Abstract

Biomedical data sets often have mixed categorical and numerical types, where the former represent semantic information about the objects and the latter represent experimental results. We present the BILCOM algorithm for 'Bi-Level Clustering of Mixed categorical and numerical data types'. BILCOM performs a pseudo-Bayesian process in which the categorical clustering serves as the prior. BILCOM partitions biomedical data sets of mixed types, such as hepatitis, thyroid disease and yeast gene expression data with Gene Ontology annotations, more accurately than clustering on either type alone.
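
The abstract gives only the outline of the method: a clustering of the categorical attributes acts as a prior for a pseudo-Bayesian step that brings in the numerical data. As a rough illustration of that two-level idea, and not the published BILCOM procedure, the Python sketch below first assigns objects to clusters by categorical mismatch count, then rescores each object with a prior that favours its first-level cluster multiplied by a likelihood based on numerical distance to each cluster centroid. The function names, the seed modes, the `prior_weight` and `scale` parameters, and the Gaussian-style likelihood are all assumptions made for this example.

```python
import math


def mismatches(a, b):
    """Simple matching dissimilarity: number of differing categorical values."""
    return sum(x != y for x, y in zip(a, b))


def categorical_level(cat_rows, seeds):
    """Level 1 (assumed): assign each object to the seed mode with fewest mismatches."""
    return [
        min(range(len(seeds)), key=lambda k: mismatches(row, seeds[k]))
        for row in cat_rows
    ]


def centroids(num_rows, labels, k):
    """Mean numerical vector per cluster; None for a cluster with no members."""
    dim = len(num_rows[0])
    result = []
    for c in range(k):
        members = [r for r, lab in zip(num_rows, labels) if lab == c]
        if not members:
            result.append(None)
            continue
        result.append([sum(m[d] for m in members) / len(members) for d in range(dim)])
    return result


def numerical_level(num_rows, labels, k, prior_weight=0.7, scale=1.0):
    """Level 2 (assumed): reassign each object to the cluster maximising
    prior * likelihood, where the prior favours the level-1 cluster and the
    likelihood decays with squared distance to the cluster centroid."""
    cents = centroids(num_rows, labels, k)
    other_prior = (1.0 - prior_weight) / max(k - 1, 1)
    new_labels = []
    for row, lab in zip(num_rows, labels):
        best, best_score = lab, -1.0
        for c in range(k):
            if cents[c] is None:
                continue
            prior = prior_weight if c == lab else other_prior
            d2 = sum((x - y) ** 2 for x, y in zip(row, cents[c]))
            score = prior * math.exp(-d2 / (2.0 * scale ** 2))
            if score > best_score:
                best, best_score = c, score
        new_labels.append(best)
    return new_labels


if __name__ == "__main__":
    # Toy data: the last object looks like cluster 0 categorically,
    # but its numerical value sits with cluster 1.
    cat_rows = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y"), ("a", "x")]
    num_rows = [[0.0], [0.2], [5.0], [5.2], [5.1]]
    seeds = [("a", "x"), ("b", "y")]

    level1 = categorical_level(cat_rows, seeds)
    level2 = numerical_level(num_rows, level1, k=2)
    print(level1, level2)
```

In this toy run the last object is placed in the first cluster on categorical evidence alone, and the numerical evidence is strong enough to move it at the second level. How strongly the prior resists such moves is governed here by `prior_weight`, which is an illustrative knob rather than a parameter taken from the paper.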