One of the most essential topics in robust statistics is the robust estimation of location and covariance. Many popular robust location and scatter estimators, such as Fast-MCD, MVE, and MZE, require at least a convex distribution of the underlying data. For non-convex data distributions these approaches may lead to suboptimal results, because they apply Mahalanobis distances with respect to the location and covariance of a suitably chosen subsample of the data, which implies a convex structure. The approach presented here addresses this drawback by using Euclidean distances. The data set is treated as a complete network, and the minimum spanning tree (MST) of this network is calculated. Based on the MST, a subset of relevant points (thought of as an "outlier-free" subsample of minimum size) is determined, which can then be used for calculating data characteristics. It is shown that the approach has a maximum breakdown point. Additionally, a simulation study provides insights into the approach's behaviour with respect to increasing dimension and sample size.
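The MST-based subset selection described above can be illustrated with a minimal sketch. The abstract does not spell out how the subset is extracted from the MST, so the rule used here is an assumption for illustration only: MST edges are joined from shortest to longest until one connected component reaches a minimum size `h`, with the default `h = n // 2 + 1` also being an assumption. The function names `euclidean_mst` and `mst_core_subset` are hypothetical, and Prim's algorithm stands in for whichever MST algorithm the authors use.

```python
import math


def euclidean_mst(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the n - 1 tree edges as (length, i, j) tuples.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection to the tree
    parent = [-1] * n       # tree vertex realising that connection
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_core_subset(points, h=None):
    """Indices of a tightly connected subset with at least h points.

    Assumed rule (not taken from the paper): join MST edges from shortest
    to longest and stop as soon as one component holds h points.
    """
    n = len(points)
    if h is None:
        h = n // 2 + 1  # assumed default: a bare majority of the data
    root_of = list(range(n))
    size = [1] * n

    def find(x):
        while root_of[x] != x:
            root_of[x] = root_of[root_of[x]]  # path halving
            x = root_of[x]
        return x

    for _, i, j in sorted(euclidean_mst(points)):
        ri, rj = find(i), find(j)
        if size[ri] < size[rj]:
            ri, rj = rj, ri
        root_of[rj] = ri
        size[ri] += size[rj]
        if size[ri] >= h:
            return sorted(k for k in range(n) if find(k) == ri)
    return list(range(n))


# Six clustered points plus two far-away points acting as outliers.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (2, 1), (50, 50), (60, -40)]
core = mst_core_subset(points)
```

Because only pairwise Euclidean distances enter the computation, the selected subset can follow non-convex shapes; location and covariance would then be computed from the points indexed by `core`.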