Automatic cervical cell segmentation and classification in Pap smears

  • Authors:
  • Thanatip Chankong;Nipon Theera-Umpon;Sansanee Auephanwiriyakul

  • Venue:
  • Computer Methods and Programs in Biomedicine
  • Year:
  • 2014

Abstract

Cervical cancer is one of the leading causes of cancer death in women worldwide. The disease can be cured if the patient is diagnosed at the pre-cancerous lesion stage or earlier. The Papanicolaou test, or Pap test, is a common examination technique widely used in screening. This research proposes a method for automatic cervical cancer cell segmentation and classification. A single-cell image is segmented into nucleus, cytoplasm, and background using the fuzzy C-means (FCM) clustering technique. Four cell classes are considered in the ERUDIT and LCH datasets: normal, low-grade squamous intraepithelial lesion (LSIL), high-grade squamous intraepithelial lesion (HSIL), and squamous cell carcinoma (SCC). A 2-class problem is obtained by grouping the last 3 classes into a single abnormal class. The Herlev dataset, in contrast, consists of 7 cell classes: superficial squamous, intermediate squamous, columnar, mild dysplasia, moderate dysplasia, severe dysplasia, and carcinoma in situ. These 7 classes can also be grouped to form a 2-class problem. The 3 datasets were tested with 5 classifiers: a Bayesian classifier, linear discriminant analysis (LDA), K-nearest neighbor (KNN), artificial neural networks (ANN), and a support vector machine (SVM). For the ERUDIT dataset, ANN with 5 nucleus-based features yielded accuracies of 96.20% and 97.83% on the 4-class and 2-class problems, respectively. For the Herlev dataset, ANN with 9 cell-based features yielded accuracies of 93.78% and 99.27% on the 7-class and 2-class problems, respectively. For the LCH dataset, ANN with 9 cell-based features yielded accuracies of 95.00% and 97.00% on the 4-class and 2-class problems, respectively. The segmentation and classification performance of the proposed method was compared with that of hard C-means clustering and the watershed technique. The results show that the proposed automatic approach performs very well and outperforms its counterparts.
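
The FCM segmentation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it clusters raw grayscale intensities only, and it assumes that the nucleus is the darkest region and the background the brightest. The function names `fuzzy_c_means` and `segment_cell`, the fuzzifier `m = 2`, and the evenly spaced initial centers are all choices made for this illustration.

```python
import numpy as np


def fuzzy_c_means(x, n_clusters=3, m=2.0, n_iter=100, tol=1e-4):
    """Standard fuzzy C-means on 1-D samples.

    Returns (centers, memberships); memberships has shape
    (n_clusters, n_samples) and each column sums to 1.
    """
    x = np.asarray(x, dtype=float).ravel()
    # Assumption: start with centers spread evenly over the intensity range.
    centers = np.linspace(x.min(), x.max(), n_clusters)
    for _ in range(n_iter):
        # Distance of every sample to every center (epsilon avoids 0 division).
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-9
        # Membership update: u_ik proportional to d_ik^(-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)
        # Center update: mean of the samples weighted by u^m.
        um = u ** m
        new_centers = (um @ x) / um.sum(axis=1)
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def segment_cell(gray):
    """Label each pixel 0, 1, or 2 from darkest to brightest cluster.

    Assumption: 0 = nucleus, 1 = cytoplasm, 2 = background.
    """
    gray = np.asarray(gray, dtype=float)
    centers, u = fuzzy_c_means(gray, n_clusters=3)
    # Map cluster indices to ranks so that labels follow intensity order.
    order = np.argsort(centers)
    rank = np.empty_like(order)
    rank[order] = np.arange(order.size)
    # Defuzzify: assign each pixel to its highest-membership cluster.
    return rank[np.argmax(u, axis=0)].reshape(gray.shape)
```

Unlike hard C-means, each pixel carries a graded membership in every cluster until the final `argmax` step, which is what distinguishes the two clustering approaches compared in the abstract.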