Ensembles of pre-processing techniques for noise detection in gene expression data

  • Authors:
  • Giampaolo L. Libralon; André C. Ponce Leon Ferreira Carvalho; Ana C. Lorena

  • Affiliations:
  • ICMC, USP, São Carlos, SP, Brazil; ICMC, USP, São Carlos, SP, Brazil; Center of Mathematics, Computation and Cognition, ABC Fed. Univ., Santo André, SP, Brazil

  • Venue:
  • ICONIP'08 Proceedings of the 15th international conference on Advances in neuro-information processing - Volume Part I
  • Year:
  • 2008

Abstract

Because biological experiments are inherently imprecise, biological data often contain redundant and noisy instances, usually introduced by errors in data collection such as contamination of laboratory samples. Gene expression data are one example of biological data affected by this problem. Machine Learning algorithms have been successfully used in gene expression analysis. Although many Machine Learning algorithms can tolerate noise, detecting and removing noisy instances from the data can help the induction of the target hypothesis. This paper evaluates distance-based pre-processing techniques on gene expression data, analyzing how effectively these techniques, individually and in combination, remove noisy data. Effectiveness is measured by the accuracy that different Machine Learning classifiers obtain on the pre-processed data. The results indicate that the pre-processing techniques employed were effective for noise detection.
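
The abstract does not name the specific distance-based techniques or how they are combined, so the following is only a minimal illustrative sketch, not the authors' method. It assumes a filter in the style of edited nearest neighbours (a sample is flagged when its label disagrees with the majority label of its k nearest neighbours) and builds a simple ensemble by majority vote over the same filter run with different values of k. The function names, the toy data, and the choice of k values are all hypothetical.

```python
import math
from collections import Counter


def knn_labels(X, y, i, k):
    """Labels of the k nearest neighbours of sample i (sample i itself excluded)."""
    dists = sorted(
        (math.dist(X[i], X[j]), j) for j in range(len(X)) if j != i
    )
    return [y[j] for _, j in dists[:k]]


def enn_flags(X, y, k=3):
    """Assumed filter: flag a sample as noisy when its label differs from the
    majority label among its k nearest neighbours (Euclidean distance)."""
    flags = []
    for i in range(len(X)):
        majority = Counter(knn_labels(X, y, i, k)).most_common(1)[0][0]
        flags.append(majority != y[i])
    return flags


def ensemble_flags(X, y, ks=(3, 5, 7)):
    """Assumed ensemble: a sample is noisy if most of the filters flag it."""
    votes = [enn_flags(X, y, k) for k in ks]
    return [sum(v[i] for v in votes) > len(ks) / 2 for i in range(len(X))]


# Toy data (not gene expression data): two well-separated clusters.
# The sample at index 4 sits in the first cluster but carries the label "B".
X = [(0.0, 0.0), (0.3, 0.1), (0.1, 0.4), (0.4, 0.4), (0.2, 0.2), (0.5, 0.0),
     (5.0, 5.0), (5.3, 5.1), (5.1, 5.4), (5.4, 5.4), (5.2, 5.2), (5.5, 5.0)]
y = ["A", "A", "A", "A", "B", "A", "B", "B", "B", "B", "B", "B"]

noisy = ensemble_flags(X, y)

# Keep only the samples that the ensemble did not flag.
clean_X = [x for x, flagged in zip(X, noisy) if not flagged]
clean_y = [label for label, flagged in zip(y, noisy) if not flagged]
```

In the evaluation the abstract describes, a classifier would then be trained on the filtered data (`clean_X`, `clean_y` here) and its accuracy compared with the accuracy obtained on the unfiltered data.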