High throughput analysis of breast cancer specimens on the grid

  • Authors:
  • Lin Yang;Wenjin Chen;Peter Meer;Gratian Salaru;Michael D. Feldman;David J. Foran

  • Affiliations:
  • Dept. of Electrical and Computer Eng., Rutgers Univ., Piscataway, NJ and Center of Biomedical Imaging and Informatics, The Cancer Institute of New Jersey, UMDNJ-Robert Wood Johnson Medical School, ...;Center of Biomedical Imaging and Informatics, The Cancer Institute of New Jersey, UMDNJ-Robert Wood Johnson Medical School, Piscataway, NJ;Dept. of Electrical and Computer Eng., Rutgers Univ., Piscataway, NJ;Center of Biomedical Imaging and Informatics, The Cancer Institute of New Jersey, UMDNJ-Robert Wood Johnson Medical School, Piscataway, NJ;Dept. of Surgical Pathology, Univ. of Pennsylvania, Philadelphia, PA;Center of Biomedical Imaging and Informatics, The Cancer Institute of New Jersey, UMDNJ-Robert Wood Johnson Medical School, Piscataway, NJ

  • Venue:
  • MICCAI'07 Proceedings of the 10th international conference on Medical image computing and computer-assisted intervention - Volume Part I
  • Year:
  • 2007

Abstract

Breast cancer accounts for about 30% of all cancers and 15% of all cancer deaths in women in the United States. Advances in computer-assisted diagnosis (CAD) hold promise for early detection and staging of disease progression. In this paper we introduce a Grid-enabled CAD system that performs automatic analysis of imaged histopathology breast tissue specimens. More than 100,000 digitized samples (1200 × 1200 pixels) have already been processed on the Grid. We have analyzed results for 3744 breast tissue samples, which originated from four different institutions and were stained with diaminobenzidine (DAB) and hematoxylin. Both linear and nonlinear dimension reduction techniques were compared, and the best one (ISOMAP) was applied to reduce the dimensionality of the features. The experimental results show that Gentle Boosting using an eight-node CART decision tree as the weak learner provides the best classification result. The algorithm achieves an accuracy of 86.02% using only 20% of the specimens as the training set.
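
The classification step named in the abstract can be illustrated with a minimal sketch of Gentle Boosting (Gentle AdaBoost). This is not the authors' implementation, and it rests on several assumptions: a one-split regression stump stands in for the eight-node CART weak learner, the ISOMAP dimension reduction step is omitted, and the data are synthetic two-dimensional points rather than features from stained tissue images. All function names (`fit_stump`, `gentle_boost`, `classify`, `demo_accuracy`) are hypothetical. The only detail taken from the abstract is the 20% training split.

```python
import math
import random


def fit_stump(X, y, w):
    """Weighted least-squares regression stump: (feature, threshold, left, right)."""

    def wmean(idx):
        return sum(w[i] * y[i] for i in idx) / sum(w[i] for i in idx)

    def werr(pred):
        return sum(wi * (yi - p) ** 2 for wi, yi, p in zip(w, y, pred))

    mean = wmean(range(len(X)))
    best = (0, math.inf, mean, mean)  # constant fallback when no split helps
    best_err = werr([mean] * len(X))
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # midpoint between neighbouring feature values
            left = wmean([i for i, row in enumerate(X) if row[j] <= t])
            right = wmean([i for i, row in enumerate(X) if row[j] > t])
            err = werr([left if row[j] <= t else right for row in X])
            if err < best_err:
                best, best_err = (j, t, left, right), err
    return best


def stump_predict(stump, x):
    j, t, left, right = stump
    return left if x[j] <= t else right


def gentle_boost(X, y, rounds=20):
    """Gentle AdaBoost: labels y must be -1 or +1. Returns the fitted stumps."""
    n = len(X)
    w = [1.0 / n] * n
    stumps = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)  # weak learner fitted by weighted regression
        stumps.append(stump)
        # Gentle update: w_i <- w_i * exp(-y_i * f_m(x_i)), then renormalise.
        w = [wi * math.exp(-yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return stumps


def classify(stumps, x):
    """Sign of the summed weak-learner outputs."""
    score = sum(stump_predict(s, x) for s in stumps)
    return 1 if score >= 0 else -1


def demo_accuracy(seed=0, n=200, train_fraction=0.2):
    """Train on a fraction of synthetic points, return held-out accuracy."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.choice([-1, 1])
        # Synthetic stand-in for reduced image features (an assumption).
        data.append(([rng.gauss(3 * label, 1), rng.gauss(3 * label, 1)], label))
    split = int(train_fraction * n)
    train, test = data[:split], data[split:]
    stumps = gentle_boost([x for x, _ in train], [c for _, c in train])
    hits = sum(classify(stumps, x) == c for x, c in test)
    return hits / len(test)


print(f"held-out accuracy: {demo_accuracy():.3f}")
```

Unlike discrete AdaBoost, each round here adds the real-valued regression output of the weak learner directly, with no per-round coefficient. Swapping the stump for a deeper regression tree would bring the sketch closer to the setup the abstract describes.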