Robust regression and outlier detection
Robust regression and outlier detection
Algorithms for clustering data
Algorithms for clustering data
Data mining: concepts and techniques
Data mining: concepts and techniques
Detecting pattern-based outliers
Pattern Recognition Letters
Hi-index | 0.00 |
Performance of clustering algorithms is largely dependent on selected similarity measure. Efficiency in handling outliers is a major contributor to the effectiveness of a similarity measure. In the present work, we discuss the problem of handling outliers with different existing similarity measures, and introduce the concepts of a new kind of outliers present in gene expression patterns. We formulate a new similarity, incorporated in Euclidean distance and Pearson correlation coefficient, and then use them in various clustering algorithms to group different gene expression profiles. Assessment of the results are done by using functional annotation. Different existing similarity measures in their traditional form are also used with clustering algorithms for performance comparisons. The results suggest that the new similarity improves performance, in terms of finding biologically relevant groups of genes, of all the considered clustering algorithms.