Bump hunting in high-dimensional data
Statistics and Computing
Boosted PRIM with Application to Searching for Oncogenic Pathway of Lung Cancer
CSB '04 Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference
Multivariate mode hunting: Data analytic tools with measures of significance
Journal of Multivariate Analysis
This paper analyzes a data mining/bump hunting technique known as PRIM [1]. PRIM finds regions in a high-dimensional input space in which a real-valued output variable takes large values. The paper provides the first thorough study of the statistical properties of PRIM. Among other things, we characterize the output regions that PRIM produces and derive rates of convergence for these regions. Since the dimension of the input variables is allowed to grow with the sample size, the results provide some insight into the qualitative behavior of PRIM in very high dimensions. Our investigations also reveal some shortcomings of PRIM, which lead to proposals for modifications.
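
To make the idea of "finding regions with large output values" concrete, the following is a minimal sketch of the top-down peeling stage as it is commonly described for PRIM [1]. It is not the version analyzed in this paper. The function name `prim_peel`, the parameters `alpha` (peeling fraction) and `min_support` (smallest allowed fraction of points in the box, assumed to be in (0, 1]), and the use of empirical quantiles are illustrative assumptions. The pasting and covering stages are omitted.

```python
import numpy as np

def prim_peel(X, y, alpha=0.05, min_support=0.1):
    """Greedy peeling sketch (hypothetical helper, not the paper's code).

    Starting from the box that holds all points, repeatedly remove an
    alpha-fraction slab from the low or high end of one coordinate,
    choosing the slab whose removal leaves the largest mean of y.
    Peeling stops when every candidate would leave fewer than
    min_support * n points in the box.
    """
    n, d = X.shape
    inside = np.ones(n, dtype=bool)  # points currently in the box
    while True:
        best_mean, best_keep = -np.inf, None
        for j in range(d):
            xj = X[inside, j]
            lo_cut = np.quantile(xj, alpha)        # peel from below
            hi_cut = np.quantile(xj, 1.0 - alpha)  # peel from above
            candidates = (
                inside & (X[:, j] > lo_cut),
                inside & (X[:, j] < hi_cut),
            )
            for keep in candidates:
                if keep.sum() < min_support * n:
                    continue  # box would become too small
                m = y[keep].mean()
                if m > best_mean:
                    best_mean, best_keep = m, keep
        if best_keep is None:
            break  # no admissible peel is left
        inside = best_keep
    # Report the box as the bounding box of the surviving points.
    lower = X[inside].min(axis=0)
    upper = X[inside].max(axis=0)
    return lower, upper, inside
```

Calling `prim_peel(X, y)` on an `(n, d)` array `X` and a length-`n` array `y` returns the lower and upper corners of the final box together with a boolean mask of the points inside it. Each peel removes at least one point, so the loop always terminates.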