Algorithms and complexity for least median of squares regression
Discrete Applied Mathematics
Robust regression and outlier detection
Robust regression methods for computer vision: a review
International Journal of Computer Vision
The feasible solution algorithm for least trimmed squares regression
Computational Statistics & Data Analysis
Improved feasible solution algorithms for high breakdown estimation
Computational Statistics & Data Analysis
BIRCH: A New Data Clustering Algorithm and Its Applications
Data Mining and Knowledge Discovery
Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values
Data Mining and Knowledge Discovery
Efficient and Effective Clustering Methods for Spatial Data Mining
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
An empirical analysis of software effort estimation with outlier elimination
Proceedings of the 4th international workshop on Predictor models in software engineering
Multivariate generalized S-estimators
Journal of Multivariate Analysis
A procedure for robust fitting in nonlinear regression
Computational Statistics & Data Analysis
Finding approximate solutions to combinatorial problems with very large data sets using BIRCH
Computational Statistics & Data Analysis
A kernel hat matrix based rejection criterion for outlier removal in support vector regression
IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
A relaxed approach to combinatorial problems in robustness and diagnostics
Statistics and Computing
Bounded influence support vector regression for robust single-model estimation
IEEE Transactions on Neural Networks
OWA operators in regression problems
IEEE Transactions on Fuzzy Systems
Imputation of missing values for compositional data using classical and robust methods
Computational Statistics & Data Analysis
Outlier detection and least trimmed squares approximation using semi-definite programming
Computational Statistics & Data Analysis
An evolutionary algorithm for robust regression
Computational Statistics & Data Analysis
Semiparametrically weighted robust estimation of regression models
Computational Statistics & Data Analysis
Robust diagnostics for the heteroscedastic regression model
Computational Statistics & Data Analysis
Adaptive Modeling of Analog/RF Circuits for Efficient Fault Response Evaluation
Journal of Electronic Testing: Theory and Applications
The least trimmed quantile regression
Computational Statistics & Data Analysis
Benchmark testing of algorithms for very robust regression: FS, LMS and LTS
Computational Statistics & Data Analysis
A novel approach for the registration of weak affine images
Pattern Recognition Letters
Multi-Objective Genetic Algorithm for Robust Clustering with Unknown Number of Clusters
International Journal of Applied Evolutionary Computation
On the value of outlier elimination on software effort estimation research
Empirical Software Engineering
An Adversarial Optimization Approach to Efficient Outlier Removal
Journal of Mathematical Imaging and Vision
An approach to the mean shift outlier model by Tikhonov regularization and conic programming
Intelligent Data Analysis - Business Analytics and Intelligent Optimization
Data mining aims to extract previously unknown patterns or substructures from large databases. In statistics, this is what methods of robust estimation and outlier detection were constructed for; see e.g. Rousseeuw and Leroy (1987). Here we focus on least trimmed squares (LTS) regression, which is based on the subset of h cases (out of n) whose least squares fit has the smallest sum of squared residuals. The coverage h may be set between n/2 and n. The computation time of existing LTS algorithms grows too quickly with the size of the data set, precluding their use for data mining. In this paper we develop a new algorithm called FAST-LTS. The basic ideas are an inequality involving order statistics and sums of squared residuals, and techniques which we call 'selective iteration' and 'nested extensions'. We also use an intercept adjustment technique to improve the precision. For small data sets FAST-LTS typically finds the exact LTS, whereas for larger data sets it gives more accurate results than existing algorithms for LTS and is faster by orders of magnitude. This allows us to apply FAST-LTS to large databases.
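The LTS objective described above, and the concentration idea that fast LTS algorithms build on, can be sketched in a few lines. This is only an illustrative sketch, not the authors' FAST-LTS implementation: the function names (`c_step`, `basic_lts`) and the simple random-elemental-start scheme are assumptions for demonstration, and none of the paper's speed-up techniques (selective iteration, nested extensions, intercept adjustment) are included.

```python
import numpy as np

def c_step(X, y, subset, h):
    """One concentration step: fit least squares on the current subset,
    then keep the h cases with the smallest squared residuals."""
    beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
    r2 = (y - X @ beta) ** 2
    return np.argsort(r2)[:h]

def basic_lts(X, y, h, n_starts=20, n_steps=10, rng=None):
    """Naive LTS approximation: random elemental starts, each refined by
    concentration steps; returns the best (objective, coefficients) found."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    best = (np.inf, None)
    for _ in range(n_starts):
        subset = rng.choice(n, size=p, replace=False)  # elemental start
        for _ in range(n_steps):
            subset = c_step(X, y, subset, h)
        # Final fit on the converged h-subset and its trimmed objective:
        beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()
        if obj < best[0]:
            best = (obj, beta)
    return best
```

Because each concentration step can only decrease the sum of the h smallest squared residuals, the inner loop converges quickly; the many random restarts are what make naive schemes too slow for large n, which is the cost FAST-LTS is designed to avoid.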