We study the problem of assessing the reliability of a statistical measurement on a data set containing an unknown quantity of noise, inconsistencies, and outliers. We explore a practical approach that analyzes the dynamical patterns (trends) of the statistical measurement through a sequential extreme-boundary-points (EBP) weed-out process. We categorize the weed-out trend patterns (WOTP) and examine their relation to the reliability of the measurement. The approach is applied to extracting genes that are predictive of BCL2 translocations and of clinical survival outcomes of diffuse large B-cell lymphoma (DLBCL) from DNA microarray gene expression profiling data sets. Fisher's Discriminant Criterion (FDC) is used as the statistical measurement in these processes. We find that the weed-out trend analysis (WOTA) approach is effective for qualitatively assessing the statistics-based measurements in the experiments conducted.
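As a rough illustration of the weed-out idea described above, the following Python sketch tracks a two-class Fisher criterion for a single feature while extreme points are removed step by step. It rests on assumptions the abstract does not confirm: the EBP rule is taken to be "drop the smallest and largest pooled values at each step", the criterion is taken in its common single-feature form (squared mean gap over summed variances), and the names `fisher_criterion` and `weed_out_trend` are hypothetical. It is not the authors' implementation.

```python
from statistics import mean, pvariance


def fisher_criterion(a, b):
    """Single-feature, two-class Fisher criterion (assumed form):
    squared difference of class means over the sum of class variances."""
    spread = pvariance(a) + pvariance(b)
    if spread == 0:
        return float("inf") if mean(a) != mean(b) else 0.0
    return (mean(a) - mean(b)) ** 2 / spread


def weed_out_trend(a, b, steps):
    """Return the criterion before any removal and after each weed-out step.

    Assumed EBP rule: at every step, remove the smallest and the largest
    value of the pooled samples. Stops early rather than shrink a class
    below two samples.
    """
    a, b = list(a), list(b)  # work on copies
    trend = [fisher_criterion(a, b)]
    for _ in range(steps):
        for pick in (min, max):
            value = pick(a + b)
            group = a if value in a else b
            if len(group) <= 2:
                return trend
            group.remove(value)
        trend.append(fisher_criterion(a, b))
    return trend


# Hypothetical expression values for one gene in two classes;
# 9.0 plays the role of an outlier in the first class.
class_a = [1.0, 1.2, 0.9, 1.1, 9.0]
class_b = [3.0, 3.2, 2.9, 3.1, 3.0]
trend = weed_out_trend(class_a, class_b, steps=1)
```

In this toy data the criterion rises sharply once the outlier is weeded out, which shows how the sequence of values carries information that a single measurement does not. How such trends should be categorized and related to reliability is what the paper's WOTP analysis addresses; the abstract does not give those details.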