Correcting MM estimates for "fat" data sets

Authors:
Ricardo A. Maronna; Victor J. Yohai
Affiliations:
Department of Mathematics, School of Exact Sciences, Universidad Nacional de La Plata and C.I.C.B.A., Argentina; Department of Mathematics, School of Exact and Natural Sciences, Universidad de Buenos Aires and CONICET, Argentina
Venue:
Computational Statistics & Data Analysis
Year:
2010

Abstract

Regression MM estimates require estimating the error scale and determining a constant that controls the efficiency. Both steps rely on asymptotic results derived under the assumption that the number of predictors p remains fixed while the number of observations n tends to infinity, which amounts to assuming that the ratio p/n is "small". However, many high-dimensional data sets have a "large" value of p/n (say, ≥ 0.2). It is shown that the standard asymptotic results do not hold if p/n is large; namely, (a) the estimated scale underestimates the true error scale, and (b) even if the scale is correctly estimated, the actual efficiency can be much lower than the nominal one. To overcome these drawbacks, simple corrections for the scale and for the efficiency-controlling constant are proposed, and it is demonstrated that these corrections improve the estimate's performance under both normal and contaminated data.
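
As a rough illustration of point (a), the following sketch simulates how a residual scale can fall below the true error scale when p/n is not small. It is not the paper's procedure: as a simplifying assumption it substitutes an ordinary least-squares fit and the normalized MAD of the residuals for the MM estimate and its robust scale, and it does not implement the proposed corrections. The function name `mean_residual_scale` and all parameter values are hypothetical choices made for this example.

```python
import numpy as np

def mean_residual_scale(n, p, reps=200, seed=0):
    """Average normalized MAD of least-squares residuals over simulated data sets.

    Assumption: LS fit plus MAD stand in for the MM estimate and its scale;
    this only illustrates the effect of p/n, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    scales = []
    for _ in range(reps):
        X = rng.standard_normal((n, p))
        # True coefficients are zero, so y is pure N(0, 1) error (true scale 1).
        y = rng.standard_normal(n)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta
        # MAD divided by 0.6745 is consistent for the normal standard deviation.
        mad = np.median(np.abs(r - np.median(r)))
        scales.append(mad / 0.6745)
    return float(np.mean(scales))

small_ratio = mean_residual_scale(n=1000, p=20)  # p/n = 0.02
large_ratio = mean_residual_scale(n=100, p=20)   # p/n = 0.2
print(f"p/n = 0.02: mean scale = {small_ratio:.3f}")
print(f"p/n = 0.20: mean scale = {large_ratio:.3f}")
```

With the true error scale fixed at 1, comparing the two printed averages shows how much of the error scale is absorbed by the fit as p/n grows.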