Efficient fitting of long-tailed data sets into phase-type distributions

  • Authors:
  • Alma Riska;Vesselin Diev;Evgenia Smirni

  • Affiliations:
  • College of William and Mary, Williamsburg, VA;College of William and Mary, Williamsburg, VA;College of William and Mary, Williamsburg, VA

  • Venue:
  • ACM SIGMETRICS Performance Evaluation Review
  • Year:
  • 2002

Abstract

We propose a new technique for fitting long-tailed data sets into phase-type (PH) distributions. This technique fits data sets with non-monotone densities into a mixture of Erlang and hyperexponential distributions, and data sets with completely monotone densities into hyperexponential distributions. The method first partitions the data set in a divide-and-conquer fashion and then uses the Expectation-Maximization (EM) algorithm to fit the data of each partition into a PH distribution. The fitting results for the individual partitions are then combined to generate the final fit for the entire data set. The new method is accurate, efficient, and allows one to apply existing analytic tools to analyze the behavior of queueing systems that operate under workloads that exhibit long-tail behavior, such as queues in Internet-related systems.
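
The partition, fit, and combine steps described above can be illustrated with a minimal sketch. It covers only the completely monotone case (hyperexponential fitting) and is not the authors' implementation. The abstract does not state the partitioning rule or how partition results are weighted, so the sketch assumes equal-count splits of the sorted data and weights each partition's fit by its share of the samples. The function names `em_hyperexp` and `fit_partitioned`, and the parameters `n_parts`, `k`, and `iters`, are hypothetical.

```python
import numpy as np

def em_hyperexp(x, k=2, iters=200):
    """Textbook EM for a k-phase hyperexponential (mixture of exponentials)."""
    x = np.asarray(x, dtype=float)
    # Start with distinct rates spread around 1/mean (assumed initialization).
    rates = np.logspace(-0.5, 0.5, k) / x.mean()
    probs = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step in log space: responsibility of each phase for each sample.
        log_d = np.log(probs) + np.log(rates) - np.outer(x, rates)
        log_d -= log_d.max(axis=1, keepdims=True)
        resp = np.exp(log_d)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update phase probabilities and rates.
        nk = resp.sum(axis=0)
        probs = nk / x.size
        rates = nk / (resp * x[:, None]).sum(axis=0)
    return probs, rates

def fit_partitioned(data, n_parts=4, k=2):
    """Fit each partition separately, then merge into one hyperexponential."""
    data = np.sort(np.asarray(data, dtype=float))
    all_probs, all_rates = [], []
    # Assumption: equal-count partitions of the sorted data.
    for part in np.array_split(data, n_parts):
        probs, rates = em_hyperexp(part, k)
        # Assumption: weight each partition's fit by its fraction of samples.
        all_probs.append(probs * part.size / data.size)
        all_rates.append(rates)
    return np.concatenate(all_probs), np.concatenate(all_rates)

# Usage on synthetic long-tailed data (Lomax/Pareto II samples).
rng = np.random.default_rng(0)
data = rng.pareto(1.5, size=5000)
probs, rates = fit_partitioned(data)
```

The combined result is a hyperexponential with `n_parts * k` phases. Because each M-step of this EM matches the partition's sample mean, the mean of the combined fit, `(probs / rates).sum()`, equals the sample mean of the whole data set under the weighting assumed here.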