Improving mixture tree construction using better EM algorithms

  • Authors:
  • Shu-Chuan (Grace) Chen;Bruce Lindsay

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2014

Abstract

This paper is concerned with hierarchical clustering of long binary sequence data. We propose two alternative improvements of the EM algorithm used in Chen and Lindsay (2006). The first, FixEM, is the regular EM algorithm except that the weights π used in the ancestral mixture models are no longer updated. The second, ModalEM, clusters the data according to the modes of an estimated density function for the data. To compare these methods with each other and with other popular hierarchical clustering methods, we use a data example from the International HapMap Project, comparing both the speed of the methods and their ability to separate out true clusters. In addition, simulation studies are performed to compare the efficiency and accuracy of these methods. Our conclusion is that the new EM methods are far superior to the original, and that both provide useful alternatives to other standard clustering methods.
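To make the idea of clustering by density modes concrete, here is a minimal Python sketch of a modal EM ascent on binary sequences. It is an illustration under stated assumptions, not the authors' implementation: the function name `modal_em_binary`, the flip-rate parameter `p`, and the choice of an independent-site kernel (each site of a data sequence differs from the current point with probability `p`) are all hypothetical. The estimated density is taken to be an equal-weight mixture of such kernels centred at the data points, and each sequence is assigned to the mode reached when the ascent starts from it.

```python
import numpy as np


def modal_em_binary(data, p=0.3, max_iter=100):
    """Cluster binary sequences by the density mode each one ascends to.

    Assumed density (an illustration, not taken from the paper): an
    equal-weight mixture of independent-site kernels centred at the rows
    of `data`, where each site flips with probability p (0 < p < 0.5).
    """
    data = np.asarray(data, dtype=int)
    n_sites = data.shape[1]
    log_p, log_q = np.log(p), np.log1p(-p)

    modes = []
    for start in data:
        x = start.copy()
        for _ in range(max_iter):
            # E-step: posterior weight of each kernel centre given x,
            # computed from Hamming distances on the log scale.
            dist = np.sum(data != x, axis=1)
            log_k = dist * log_p + (n_sites - dist) * log_q
            w = np.exp(log_k - log_k.max())
            w /= w.sum()
            # M-step: for p < 0.5 the maximiser is a weighted majority
            # vote at each site (ties go to 0).
            x_new = (w @ data > 0.5).astype(int)
            if np.array_equal(x_new, x):
                break
            x = x_new
        modes.append(tuple(int(v) for v in x))

    # Sequences that reach the same mode share a cluster label.
    unique_modes = sorted(set(modes))
    labels = np.array([unique_modes.index(m) for m in modes])
    return labels, np.array(unique_modes)


if __name__ == "__main__":
    toy = [
        [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 1], [1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 0], [0, 1, 1, 1, 1, 1],
    ]
    labels, modes = modal_em_binary(toy, p=0.3)
    print(labels)
    print(modes)
```

In this sketch a smaller `p` narrows the kernel, so more sequences become their own modes and the clustering gets finer; a larger `p` merges nearby sequences into shared modes.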