An introduction to symbolic data analysis and the SODAS software

  • Authors:
  • Edwin Diday; Floriana Esposito

  • Affiliations:
  • University Paris 9 Dauphine, Ceremade, Pl. Du Mle de L. de Tassigny, 75016 Paris, France; Università di Bari, Dipartimento di Informatica, v. Orabona 4, 70125 Bari, Italy

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2003

Abstract

The data descriptions of the units are called "symbolic" when they are more complex than standard ones because they contain internal variation and are structured. Symbolic data arise from many sources, for instance when huge relational databases are summarized by their underlying concepts. Since "extracting knowledge" means obtaining explanatory results, this paper introduces and studies "symbolic objects". They model concepts and constitute an explanatory output for data analysis. Moreover, they can be used to define queries on a relational database and to propagate concepts between databases. We define "Symbolic Data Analysis" (SDA) as the extension of standard data analysis that takes symbolic data tables as input and produces symbolic objects as output. Any SDA is based on four spaces: the space of individuals, the space of concepts, the space of descriptions modelling individuals or classes of individuals, and the space of symbolic objects modelling concepts. New problems arise from these four spaces, such as the quality, robustness and reliability of the approximation of a concept by a symbolic object, the symbolic description of a class, and the consensus between symbolic descriptions. In this paper we give an overview of recent developments in SDA. We briefly describe some SDA tools and methods and, in particular, some dissimilarity measures for symbolic objects, which are central to the majority of symbolic data analysis methods. Finally, we introduce the software prototype developed by 17 teams from nine countries involved in the SODAS EUROSTAT project.
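To make the ideas in the abstract concrete, the following minimal Python sketch summarizes a small table of individuals into one symbolic description per concept, uses a description as a membership query, and compares two descriptions. Everything in it is an illustrative assumption rather than the SODAS software or the paper's own definitions: the `SymbolicDescription` class, the `town`, `age` and `job` columns, and the dissimilarity (a Hausdorff distance on intervals plus a Jaccard distance on category sets) are hypothetical choices made only for this example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class SymbolicDescription:
    """Hypothetical description of a concept: one interval, one category set."""
    age: Tuple[float, float]   # interval-valued variable (min, max)
    jobs: FrozenSet[str]       # multi-valued categorical variable


def summarize(rows: Iterable[dict], concept_key: str) -> Dict[str, SymbolicDescription]:
    """Group individuals by concept and aggregate each group into a description."""
    groups: Dict[str, List[dict]] = {}
    for row in rows:
        groups.setdefault(row[concept_key], []).append(row)
    return {
        concept: SymbolicDescription(
            age=(min(r["age"] for r in members), max(r["age"] for r in members)),
            jobs=frozenset(r["job"] for r in members),
        )
        for concept, members in groups.items()
    }


def matches(desc: SymbolicDescription, row: dict) -> bool:
    """Use a description as a query: does this individual satisfy it?"""
    low, high = desc.age
    return low <= row["age"] <= high and row["job"] in desc.jobs


def interval_distance(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Hausdorff distance between two closed intervals."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def jaccard_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """One minus the ratio of shared categories to all categories."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def dissimilarity(d1: SymbolicDescription, d2: SymbolicDescription,
                  age_range: float) -> float:
    """Illustrative dissimilarity: scaled interval term plus category term."""
    if age_range <= 0:
        raise ValueError("age_range must be positive")
    return (interval_distance(d1.age, d2.age) / age_range
            + jaccard_distance(d1.jobs, d2.jobs))


# Made-up individuals; each town plays the role of a concept.
rows = [
    {"town": "Paris", "age": 25, "job": "clerk"},
    {"town": "Paris", "age": 40, "job": "teacher"},
    {"town": "Bari", "age": 30, "job": "teacher"},
    {"town": "Bari", "age": 52, "job": "farmer"},
]
descriptions = summarize(rows, "town")
print(descriptions["Paris"])
print(matches(descriptions["Paris"], {"age": 33, "job": "clerk"}))
print(dissimilarity(descriptions["Paris"], descriptions["Bari"], age_range=27))
```

Note that the query accepts a 33-year-old clerk even though no such individual appears in the data. This over-generalization is one simple way to see why the quality of the approximation of a concept by a symbolic object becomes a question of its own.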