Feature-guided clustering of multi-dimensional flow cytometry datasets

Authors:
Qing T. Zeng;Juan Pablo Pratt;Jane Pak;Dino Ravnic;Harold Huss;Steven J. Mentzer
Affiliations:
Decision Systems Group, Brigham and Women's Hospital, 310 Thorn Building, 75 Francis Street, Harvard Medical School, Boston, MA 02115, USA and Harvard-MIT Division of Health Sciences and Technology ...;Departments of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA;Departments of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA;Departments of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA;Departments of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA;Departments of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA and Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA, USA
Venue:
Journal of Biomedical Informatics
Year:
2007

Abstract

Background: Flow cytometry produces large multi-dimensional datasets of the physical and molecular characteristics of individual cells. The objective of this study was to simplify the cytometry datasets by arranging or clustering "objects" (cells) into a smaller number of relatively homogeneous groups (clusters) on the basis of interobject similarities and dissimilarities. Results: The algorithm was designed to be driven by histogram features; that is, the relevant single-parameter histogram features were used to guide multidimensional k-means clustering without an a priori estimate of cluster number. To test this approach, we simulated cell-derived datasets using protein-coated microspheres (artificial "cells"). The microspheres were constructed to provide 119 populations in 40 samples. The feature-guided (FG) approach correctly identified 100% of the predetermined cluster combinations. In contrast, an approach based on the partition index (PI) cluster validity measure correctly identified 83.2% of the clusters. Direct comparisons of the two methods indicated that the FG method was significantly more accurate than PI in identifying both the number of clusters and the number of objects within the clusters (p
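The idea of letting single-parameter histogram features guide k-means can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which histogram features are used or how they are combined, so the sketch assumes that the features are histogram peaks, that every combination of per-parameter peaks is a candidate centroid, and that candidates attracting no events are discarded. The function names (histogram_peaks, kmeans, feature_guided_kmeans) and the bins and min_frac parameters are hypothetical.

```python
import itertools


def histogram_peaks(values, bins=10, min_frac=0.05):
    """Return the bin centres of local maxima in a 1-D histogram.

    Assumption: a "histogram feature" is a peak holding at least
    min_frac of all events. The paper may define features differently.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < bins - 1 else 0
        if c >= min_frac * len(values) and c > left and c >= right:
            peaks.append(lo + (i + 0.5) * width)
    return peaks


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Plain k-means from given seeds; seeds that attract no points are dropped."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Assumption: an empty cluster means that peak combination is unoccupied.
        clusters = [c for c in clusters if c]
        updated = [tuple(sum(col) / len(c) for col in zip(*c)) for c in clusters]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


def feature_guided_kmeans(points, bins=10):
    """Seed k-means with every combination of per-parameter histogram peaks."""
    peaks_per_dim = [histogram_peaks(dim, bins) for dim in zip(*points)]
    seeds = list(itertools.product(*peaks_per_dim))
    return kmeans(points, seeds)
```

Under these assumptions the number of clusters is never supplied by the caller: it falls out of how many peak combinations are actually occupied. With two peaks on each of two parameters there are four candidate centroids, and any combination that no events occupy is removed during the first k-means pass.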