Assisted descriptor selection based on visual comparative data analysis

Authors:
Sebastian Bremm;Tatiana von Landesberger;Jürgen Bernard;Tobias Schreck
Affiliations:
Technische Universität Darmstadt, Germany;Technische Universität Darmstadt and Fraunhofer Institute for Computer Graphics Research, Darmstadt, Germany;Technische Universität Darmstadt, Germany;Technische Universität Darmstadt, Germany
Venue:
EuroVis'11 Proceedings of the 13th Eurographics / IEEE - VGTC conference on Visualization
Year:
2011

Abstract

Exploring and selecting data descriptors, which represent objects by a set of features, is an important component of many data analysis tasks. For a given dataset, an optimal data description usually does not exist, because the suitable representation depends strongly on the use case. Many solutions for selecting a suitable data description have been proposed. Most of them require data labels, and many are black-box approaches. Non-expert users have difficulty understanding how the input, parameters, and output of these algorithms relate. The alternative, interactive systems for visual feature selection, overburdens the user with an overwhelming set of options and data views. It is therefore essential to guide users through this analytical process. In this paper, we present a novel system for data description selection that facilitates the user's access to the data analysis process. Because finding a suitable data description involves several steps, the system guides the user through them. It combines automatic data analysis with interactive visualizations and, in this way, recommends suitable data descriptor selections. It supports the comparison of data descriptors of differing dimensionality for unlabeled data. We propose specialized scores and interactive views for descriptor comparison. The visualization techniques are scatterplot-based and grid-based. For the grid-based views, we apply Self-Organizing Maps as adaptive grids, which are well suited to large multi-dimensional data sets. As an example, we demonstrate the usability of our system on a real-world biochemical application.
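
The abstract does not specify the proposed comparison scores. As a purely illustrative sketch, and not the method of the paper, one label-free way to compare descriptors of differing dimensionality is to check how well they agree on each object's nearest neighbors. The function names, the choice of Euclidean distance, the Jaccard overlap, and the toy data below are all assumptions made for this example.

```python
import math


def knn_indices(vectors, k):
    """Return, for each object, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, v in enumerate(vectors):
        # Euclidean distance from object i to every other object.
        dists = sorted((math.dist(v, w), j) for j, w in enumerate(vectors) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighborhood_overlap(desc_a, desc_b, k=2):
    """Mean Jaccard overlap of the k-NN sets induced by two descriptors.

    Hypothetical score for illustration: 1.0 means both descriptors
    produce identical neighborhoods, 0.0 means they share no neighbors.
    """
    if len(desc_a) != len(desc_b):
        raise ValueError("both descriptors must describe the same objects")
    nn_a = knn_indices(desc_a, k)
    nn_b = knn_indices(desc_b, k)
    scores = [len(a & b) / len(a | b) for a, b in zip(nn_a, nn_b)]
    return sum(scores) / len(scores)


# Toy data (assumed): the same five objects under three descriptors.
desc_2d = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0), (12.0, 1.0)]
# A 3-D descriptor that is a scaled copy of desc_2d plus a constant feature.
desc_3d = [(0.0, 0.0, 5.0), (2.0, 0.0, 5.0), (6.0, 0.0, 5.0),
           (20.0, 0.0, 5.0), (24.0, 2.0, 5.0)]
# A 1-D descriptor that arranges the objects differently.
desc_1d = [(0.0,), (10.0,), (1.0,), (11.0,), (3.0,)]

score_similar = neighborhood_overlap(desc_2d, desc_3d, k=2)
score_different = neighborhood_overlap(desc_2d, desc_1d, k=2)
print(score_similar, score_different)
```

Because the score depends only on neighbor ranks within each descriptor, it needs no labels and is unaffected by the number of features. A high value suggests that two descriptors capture similar structure, while a low value flags a pair worth inspecting visually.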