Comparison of visualization methods of genome-wide SNP profiles in childhood acute lymphoblastic leukaemia

Authors:
Ahmad Al-Oqaily;Paul J. Kennedy;Daniel Catchpoole;Simeon Simoff
Affiliations:
University of Technology, Sydney, Broadway, NSW, Australia;University of Technology, Sydney, Broadway, NSW, Australia;The Children's Hospital at Westmead, Westmead, NSW, Australia;University of Western Sydney, Parramatta, Australia
Venue:
AusDM '08 Proceedings of the 7th Australasian Data Mining Conference - Volume 87
Year:
2008

Abstract

Data mining and knowledge discovery have been applied to datasets in many domains, including biomedicine. Modelling, data mining and visualization of biomedical data address the problem of extracting knowledge from large and complex datasets. The current challenge in dealing with such data is to develop statistical and data mining methods that search and browse the underlying patterns within the data. In this paper, we employ several state-of-the-art data reduction methods for visualizing genome-wide Single Nucleotide Polymorphism (SNP) datasets. We applied several reduction methods to cope with the high dimensionality of large volumes of genetic variation data, and selected the visualization approach based on the trustworthiness of the resulting visualizations. Based on the trustworthiness metric, we found that the Neighbour Retrieval Visualizer (NeRV) outperformed the other methods. NeRV optimizes the retrieval quality of Stochastic Neighbour Embedding. The quality measure for the NeRV visualization showed excellent results, even though the dataset was reduced from 13917 to 2 dimensions. The visualization results will assist clinicians and biomedical researchers in understanding the systems biology of patients and in comparing different groups of clusters in the visualizations.
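
The abstract does not spell out how trustworthiness is computed. As a rough illustration only, the sketch below assumes the commonly used rank-based definition: for each sample, any point that appears among its k nearest neighbours in the 2-D view but not among its k nearest neighbours in the original space is penalized by how far away it really was. The function names, the toy genotype data (values 0, 1, 2) and the naive two-column projection are all hypothetical and are not the paper's data or method.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighbour_ranks(points):
    """For each point i, map every other point j to its rank (1 = nearest)."""
    n = len(points)
    ranks = []
    for i in range(n):
        # Ties in distance are broken by index so the ordering is deterministic.
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (euclidean(points[i], points[j]), j),
        )
        ranks.append({j: r for r, j in enumerate(order, start=1)})
    return ranks


def trustworthiness(high, low, k):
    """Rank-based trustworthiness of the embedding `low` of the data `high`.

    Assumed definition (not taken from the abstract):
    T(k) = 1 - 2 / (n k (2n - 3k - 1)) * sum_i sum_{j in U_k(i)} (r(i, j) - k)
    where U_k(i) holds the points that are k-neighbours of i in the embedding
    but not in the original space, and r(i, j) is the original-space rank.
    """
    n = len(high)
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n/2")
    high_ranks = neighbour_ranks(high)
    low_ranks = neighbour_ranks(low)
    penalty = 0
    for i in range(n):
        for j, r_low in low_ranks[i].items():
            if r_low <= k and high_ranks[i][j] > k:
                penalty += high_ranks[i][j] - k
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))


# Hypothetical toy data: 30 samples with 50 SNP-like features coded 0/1/2.
random.seed(0)
samples = [[random.choice((0, 1, 2)) for _ in range(50)] for _ in range(30)]

# A deliberately naive "embedding": keep only the first two features.
projected = [row[:2] for row in samples]

print("identity embedding:", trustworthiness(samples, samples, k=5))
print("two-column projection:", trustworthiness(samples, projected, k=5))
```

Under this definition an embedding that preserves every neighbourhood scores 1.0, and lower values mean the 2-D view shows neighbours that were not close in the original space. Comparing this score across candidate projection methods at a fixed k is one plausible way to carry out the kind of selection the abstract describes.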