Management and display of data analysis environments for large data sets

  • Authors:
  • Robert A. Burnett;Paula J. Cowley;James J. Thomas

  • Venue:
  • SSDBM'83 Proceedings of the Second International Workshop on Statistical Database Management
  • Year:
  • 1983

Abstract

Data analysis is typically an iterative process in which the choice of the next analysis operation is largely determined by the results of previous operations on the data set. With large data sets, many analysis paths may be explored before meaningful results are obtained. Along each path, the analyst creates a sequence of "data analysis environments," each environment being a frame or "snapshot" of the data set and associated descriptions, conditions, models, and analysis results. The data analysis environment may be changed incrementally through temporary data modifications, subsets, samples, or statistical operations; or the analyst may wish to restore the conditions of a previous environment as a starting point from which a new analysis path can be generated. Existing analysis systems, however, lack facilities to maintain, save, or restore all of the components required to completely describe or reconstruct a data analysis environment.

This paper describes ongoing research at Pacific Northwest Laboratory (PNL) in data management and display techniques for multiple data analysis environments. Specifically, research is being conducted in four major areas: (1) the development of a model of the data analysis process incorporating the concepts of data analysis environments; (2) the design and use of data modification definitions (differential files) to represent multiple versions of a large data base; (3) the use of data dictionaries/directories to manage, describe, and control multiple data analysis environments; and (4) the application of graphical display and interaction techniques to the examination and selection of data analysis environments. The results of these research efforts will be integrated to provide a new dimension in interactive data analysis.
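
The abstract does not give implementation details, so the following is only a minimal illustrative sketch of the differential-file idea in area (2): each environment stores its changes relative to a parent environment rather than a full copy of the data, and a view is reconstructed by replaying those changes over the unmodified base data. The names `Environment`, `derive`, `chain`, and `materialize`, along with the row layout and sample values, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set

Row = Dict[str, Any]


@dataclass
class Environment:
    """Hypothetical analysis environment: a delta over its parent, not a copy."""
    name: str
    parent: Optional["Environment"] = None
    updates: Dict[int, Row] = field(default_factory=dict)  # row id -> changed fields
    deleted: Set[int] = field(default_factory=set)         # row ids hidden here
    notes: str = ""

    def derive(self, name: str, notes: str = "") -> "Environment":
        """Start a new analysis path that branches from this environment."""
        return Environment(name=name, parent=self, notes=notes)

    def chain(self) -> List["Environment"]:
        """Return the environments from the root down to this one."""
        path, env = [], self
        while env is not None:
            path.append(env)
            env = env.parent
        return list(reversed(path))


def materialize(base: Dict[int, Row], env: Environment) -> Dict[int, Row]:
    """Rebuild the data set as seen in `env` without modifying `base`."""
    view = {rid: dict(row) for rid, row in base.items()}
    for step in env.chain():
        # Within one environment, apply deletions first, then field updates.
        for rid in step.deleted:
            view.pop(rid, None)
        for rid, changes in step.updates.items():
            view.setdefault(rid, {}).update(changes)
    return view


# Assumed sample data: three readings, one of them an obvious outlier.
base = {1: {"temp": 20.5}, 2: {"temp": 999.0}, 3: {"temp": 21.0}}

raw = Environment("raw")
cleaned = raw.derive("cleaned", notes="drop outlier row 2")
cleaned.deleted.add(2)
rescaled = cleaned.derive("rescaled", notes="row 1 converted to kelvin")
rescaled.updates[1] = {"temp": 293.65}

# Restoring an earlier environment is just branching from it again.
capped = raw.derive("capped", notes="cap outlier instead of dropping it")
capped.updates[2] = {"temp": 25.0}
```

In this sketch the parent links form a tree of analysis paths, which is also the kind of structure a directory or graphical browser of environments could display. Storing only deltas keeps each environment small at the cost of replaying the chain when a view is needed.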