Outlier detection and least trimmed squares approximation using semi-definite programming

  • Authors:
  • T. D. Nguyen; R. Welsch

  • Affiliations:
  • 3103 Newmark Civil Engineering Laboratory, UIUC, 205 N. Mathews Ave. Urbana, IL 61801, United States; Sloan School of Management and Center for Computational Research in Economics and Management Science, MIT, E53-383, Cambridge, MA 02139, United States

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2010

Abstract

Robust linear regression is one of the most popular problems in the robust statistics community. It is often conducted via least trimmed squares, which minimizes the sum of the k smallest squared residuals. Least trimmed squares has desirable properties and forms the basis on which several recent robust methods are built, but it is very computationally expensive due to its combinatorial nature. It is proven that the least trimmed squares problem is equivalent to a concave minimization problem under a simple linear constraint set. The "maximum trimmed squares" problem, an "almost complementary" problem that maximizes the sum of the q smallest squared residuals, in direct pursuit of the set of outliers rather than the set of clean points, is introduced. Maximum trimmed squares (MTS) can be formulated as a semi-definite programming problem, which can be solved efficiently in polynomial time using interior point methods. In addition, under reasonable assumptions, the maximum trimmed squares problem is guaranteed to identify outliers, no matter how extreme they are.
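
To make the trimmed objective concrete, the following is a minimal sketch of the quantity that least trimmed squares minimizes, evaluated for one fixed set of residuals. It is not the authors' algorithm and does not perform the combinatorial search or the semi-definite relaxation; the function name `trimmed_sum_of_squares` and the sample residuals are hypothetical and chosen only for illustration.

```python
def trimmed_sum_of_squares(residuals, k):
    """Return the sum of the k smallest squared residuals.

    This is the least trimmed squares objective for one fixed candidate fit;
    the full problem would minimize it over all regression coefficients.
    """
    if not 0 < k <= len(residuals):
        raise ValueError("k must be between 1 and the number of residuals")
    squared = sorted(r * r for r in residuals)
    return sum(squared[:k])


# Hypothetical residuals from some candidate fit; the last one is a gross outlier.
residuals = [0.5, -1.0, 0.25, 2.0, -50.0]

# Trimming to k = 4 drops the largest squared residual from the objective.
lts_value = trimmed_sum_of_squares(residuals, k=4)

# With k equal to the sample size, the objective is the ordinary sum of squares.
ols_value = trimmed_sum_of_squares(residuals, k=len(residuals))
```

In this sketch the single extreme residual dominates the untrimmed sum but has no effect on the trimmed one, which is the property that makes the trimmed objective robust. The cost comes from the fact that which residuals count as the k smallest changes with the candidate fit.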