Density Ratio Estimation: A New Versatile Tool for Machine Learning
ACML '09 Proceedings of the 1st Asian Conference on Machine Learning: Advances in Machine Learning
A Least-squares Approach to Direct Importance Estimation
The Journal of Machine Learning Research
Expert Systems with Applications: An International Journal
We propose a new statistical approach to the problem of inlier-based outlier detection, i.e., finding outliers in the test set based on a training set consisting only of inliers. Our key idea is to use the ratio of training and test data densities as an outlier score; we estimate the ratio directly in a semi-parametric fashion without going through density estimation. Our approach is therefore expected to perform better in high-dimensional problems. Furthermore, the density ratio estimation algorithm we apply is equipped with a natural cross-validation procedure, allowing us to objectively optimize the values of tuning parameters such as the regularization parameter and the kernel width. The algorithm offers a closed-form solution as well as a closed-form formula for the leave-one-out error. Thanks to this, the proposed outlier detection method is computationally very efficient and scalable to massive datasets. Simulations with benchmark and real-world datasets illustrate the usefulness of the proposed approach.
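The scoring idea can be illustrated with a minimal sketch, assuming a least-squares density-ratio estimator with a Gaussian kernel model. The function names, the choice of kernel centers, and the fixed values of `sigma` and `lam` are illustrative assumptions rather than the authors' implementation, and the leave-one-out model selection mentioned in the abstract is omitted for brevity. The ratio is modeled as a non-negative kernel expansion, fitted in closed form, and evaluated at each test point; a score near zero suggests an outlier.

```python
import numpy as np

def gaussian_kernel(x, centers, sigma):
    # Pairwise squared distances between rows of x and rows of centers.
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def density_ratio_scores(x_train, x_test, sigma=1.0, lam=0.1,
                         n_centers=100, seed=0):
    """Estimate w(x) = p_train(x) / p_test(x) at every test point.

    Hypothetical helper: sigma and lam are fixed here, whereas the
    abstract describes choosing them by cross-validation.
    """
    rng = np.random.default_rng(seed)
    n_centers = min(n_centers, len(x_train))
    # Assumption: kernel centers are a random subset of the training inliers.
    idx = rng.choice(len(x_train), size=n_centers, replace=False)
    centers = x_train[idx]

    k_train = gaussian_kernel(x_train, centers, sigma)
    k_test = gaussian_kernel(x_test, centers, sigma)

    # Least-squares fit: second moment over test data, mean over training data.
    H = k_test.T @ k_test / len(x_test)
    h = k_train.mean(axis=0)

    # Closed-form regularized solution, then clip negative coefficients.
    alpha = np.linalg.solve(H + lam * np.eye(n_centers), h)
    alpha = np.maximum(alpha, 0.0)

    # Low ratio values indicate points unlikely under the inlier density.
    return k_test @ alpha

# Toy data: Gaussian inliers plus one far-away test point appended last.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(200, 2))
x_test = np.vstack([rng.normal(size=(100, 2)), [[10.0, 10.0]]])
scores = density_ratio_scores(x_train, x_test)
print(scores[-1], np.median(scores[:-1]))
```

In this toy setup the appended point lies far from every kernel center, so its estimated ratio should be close to zero, while typical inliers receive clearly larger scores. In practice the kernel width and regularization parameter would be tuned rather than fixed.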