This paper has two intertwined goals: (i) to study the feasibility of an atlas of gene expression data sets as a visual interface to expression databanks, and (ii) to study which dimensionality reduction methods are suitable for visualizing very high-dimensional data sets. Several new methods have recently been proposed for estimating data manifolds or embeddings, but they have not so far been compared on the task of visualization. In visualization, the dimensionality is constrained not only by the data itself but also by the presentation medium. It turns out that an older method, curvilinear component analysis, outperforms the newer ones in terms of the trustworthiness of the projections. In a sample databank of gene expression data, the main sources of variation were differences between data sets, between labs, and between measurement methods. This hints at a need for better methods for making the data sets commensurable, in accordance with earlier studies. The good news is that the visualized overview, the expression atlas, reveals many of these subsets. Hence, we conclude that dimensionality reduction, even from 1339 to 2 dimensions, can produce a useful interface to gene expression databanks.
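The abstract ranks methods by the trustworthiness of their projections but does not define the measure. The sketch below assumes the commonly used rank-based definition: a projection is penalized whenever a point appears among the k nearest neighbors of another point in the low-dimensional display without being among its k nearest neighbors in the original space. The function names, the Euclidean distance, and the toy data are illustrative assumptions and are not taken from the paper.

```python
import math


def _ranks(points, i):
    """Map each index j != i to its distance rank from point i (1 = nearest).

    Ties are broken by index so the ordering is deterministic.
    """
    order = sorted(
        (j for j in range(len(points)) if j != i),
        key=lambda j: (math.dist(points[i], points[j]), j),
    )
    return {j: rank for rank, j in enumerate(order, start=1)}


def trustworthiness(high, low, k):
    """Rank-based trustworthiness of the projection `low` of the data `high`.

    Assumed definition (not quoted from the paper):
        T(k) = 1 - 2 / (N k (2N - 3k - 1)) * sum_i sum_{j in U_k(i)} (r(i, j) - k)
    where U_k(i) holds the points that are among the k nearest neighbors of i
    in the projection but not in the original space, and r(i, j) is the rank
    of j among the neighbors of i in the original space.
    """
    n = len(high)
    if len(low) != n:
        raise ValueError("high and low must contain the same number of points")
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n / 2")

    penalty = 0
    for i in range(n):
        rank_high = _ranks(high, i)
        rank_low = _ranks(low, i)
        for j, r_low in rank_low.items():
            # j looks like a close neighbor on the display but is not one
            # in the original space: a "false" neighbor.
            if r_low <= k and rank_high[j] > k:
                penalty += rank_high[j] - k

    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))


if __name__ == "__main__":
    # Toy data: six points on a line, and a projection that scrambles them.
    original = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
    scrambled = [(0.0,), (5.0,), (1.0,), (4.0,), (2.0,), (3.0,)]
    print(trustworthiness(original, original, k=1))
    print(trustworthiness(original, scrambled, k=1))
```

A projection that preserves every neighborhood scores exactly 1, and the score drops as more false neighbors appear. This pure-Python version recomputes and sorts all distances for every point, so it is only suitable for small samples; it is meant to make the measure concrete rather than to reproduce the paper's experiments.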