Computing location depth and regression depth in higher dimensions

  • Authors:
  • Peter J. Rousseeuw; Anja Struyf

  • Affiliations:
  • Department of Mathematics and Computer Science, U.I.A., Universiteitsplein 1, B-2610 Antwerp, Belgium (both authors)

  • Venue:
  • Statistics and Computing
  • Year:
  • 1998

Abstract

The location depth (Tukey 1975) of a point θ relative to a p-dimensional data set Z of size n is defined as the smallest number of data points in a closed halfspace with boundary through θ. For bivariate data, it can be computed in O(n log n) time (Rousseeuw and Ruts 1996). In this paper we construct an exact algorithm to compute the location depth in three dimensions in O(n² log n) time. We also give an approximate algorithm to compute the location depth in p dimensions in O(mp³ + mpn) time, where m is the number of p-subsets used.

Recently, Rousseeuw and Hubert (1996) defined the depth of a regression fit. The depth of a hyperplane with coefficients (θ₁, …, θₚ) is the smallest number of residuals that need to change sign to make (θ₁, …, θₚ) a nonfit. For bivariate data (p = 2) this depth can be computed in O(n log n) time as well. We construct an algorithm to compute the regression depth of a plane relative to a three-dimensional data set in O(n² log n) time, and another that deals with p = 4 in O(n³ log n) time. For data sets with large n and/or p we propose an approximate algorithm that computes the depth of a regression fit in O(mp³ + mpn + mn log n) time. For all of these algorithms, actual implementations are made available.
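The definition of location depth can be illustrated with a small sketch. The code below is not the authors' algorithm: it replaces the directions derived from p-subsets with m random Gaussian directions, which is an assumption made purely for brevity. For each direction it projects the data, counts the points in each of the two closed halfspaces bounded by the hyperplane through θ, and keeps the smallest count seen. The result is therefore an upper bound on the true depth, at a cost of O(mpn). The function name `approx_location_depth` and its parameters are hypothetical.

```python
import random


def approx_location_depth(theta, data, n_directions=500, seed=0):
    """Upper bound on the location depth of theta relative to data.

    Illustrative sketch only: uses random directions rather than
    directions computed from p-subsets of the data.
    """
    rng = random.Random(seed)
    p = len(theta)
    best = len(data)
    for _ in range(n_directions):
        # Random direction u; its length does not affect the signs below.
        u = [rng.gauss(0.0, 1.0) for _ in range(p)]
        # Signed position of each data point relative to the hyperplane
        # through theta with normal u.
        proj = [
            sum(uj * (zj - tj) for uj, zj, tj in zip(u, z, theta))
            for z in data
        ]
        # Both halfspaces are closed, so points on the boundary count twice.
        n_nonneg = sum(1 for s in proj if s >= 0.0)
        n_nonpos = sum(1 for s in proj if s <= 0.0)
        best = min(best, n_nonneg, n_nonpos)
    return best
```

For the four corners of a square together with its center, calling `approx_location_depth((0.0, 0.0), data)` gives 3 for the center, since every closed halfplane through the center contains the center itself and at least two corners. A point far outside the square, such as (10, 10), gets depth 0.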