Guiding multidimensional analysis using decision trees

  • Authors:
  • Frank van Ham;Martin Petitclerc;Ramon Pisters

  • Affiliations:
  • IBM Canada Ltd., IBM Nederland B.V.;IBM Canada Ltd., IBM Nederland B.V.;IBM Canada Ltd., IBM Nederland B.V.

  • Venue:
  • CASCON '13 Proceedings of the 2013 Conference of the Center for Advanced Studies on Collaborative Research
  • Year:
  • 2013

Visualization

Abstract

Visualization technology makes it easier for users to spot patterns in data that would be difficult to find using only a computer algorithm. However, the discovery of a particular pattern is often only the first step in any analytical process, with the ultimate goal being insight into the underlying causes of this pattern. In current explorative interfaces, this analytical process often involves iterative hypothesis generation and testing, which becomes exponentially more complex and time-consuming as the dimensionality of the data set increases. In this paper, we suggest a technique that helps a user generate potential hypotheses for a particular observation or visual feature by reporting the dimensions that correlate with it. We use a modified decision tree algorithm that is tuned not for optimal classification but for broad correlation detection. This paper presents the rationale for, algorithmic improvements in, and performance characteristics of the proposed technique, as well as a prototype implementation in a commercial data analysis tool.
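
The abstract does not give the details of the modified algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes categorical dimensions and a boolean label marking the rows that belong to the user's observation, and it uses plain information gain (the usual decision tree split criterion) to score every dimension rather than keeping only the best split. The function names, the sample rows, and the `selected` label are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, dimension):
    """Entropy reduction in `labels` after splitting `rows` on one dimension."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[dimension]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_dimensions(rows, labels):
    """Score every dimension and return them all, strongest first.

    A classifier would keep only the top split; here the whole ranking
    is reported so that each dimension can serve as a candidate hypothesis.
    """
    dimensions = rows[0].keys()
    scored = [(d, information_gain(rows, labels, d)) for d in dimensions]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: the user has selected the first two rows in a chart.
rows = [
    {"region": "north", "product": "a"},
    {"region": "north", "product": "b"},
    {"region": "south", "product": "a"},
    {"region": "south", "product": "b"},
]
selected = [True, True, False, False]

ranking = rank_dimensions(rows, selected)
for dimension, gain in ranking:
    print(f"{dimension}: {gain:.3f}")
```

In this toy data set the selection coincides exactly with `region`, so that dimension ranks first, while `product` carries no information about the selection. A real implementation would also need to handle numeric dimensions, missing values, and the cost of scanning many dimensions, none of which this sketch addresses.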