Generating Data Analysis Programs from Statistical Models

  • Authors: Bernd Fischer; Johann Schumann; Thomas Pressburger

  • Venue: SAIG '00 Proceedings of the International Workshop on Semantics, Applications, and Implementation of Program Generation
  • Year: 2000

Abstract

Extracting information from data, often also called data analysis, is an important scientific task. Statistical approaches, which use methods from probability theory and numerical analysis, are well-founded but difficult to implement: the development of a statistical data analysis program for any given application is time-consuming and requires knowledge and experience in several areas. In this paper, we describe AUTOBAYES, a high-level generator system for data analysis programs from statistical models. A statistical model specifies the properties for each problem variable (i.e., observation or parameter) and its dependencies in the form of a probability distribution. It is thus a fully declarative problem description, similar in spirit to a set of differential equations. From this model, AUTOBAYES generates optimized and fully commented C/C++ code which can be linked dynamically into the Matlab and Octave environments. Code is generated by schema-guided deductive synthesis. A schema consists of a code template and applicability constraints which are checked against the model during synthesis using theorem proving technology. AUTOBAYES augments schema-guided synthesis by symbolic-algebraic computation and can thus derive closed-form solutions for many problems. In this paper, we outline the AUTOBAYES system, its theoretical foundations in Bayesian probability theory, and its application by means of a detailed example.
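
To make the idea of a closed-form solution concrete, the sketch below shows the kind of result that symbolic computation can yield for one simple statistical model. It is not AUTOBAYES output and does not use its specification language. It assumes a hypothetical model in which observations x_1, ..., x_n are independent draws from a Gaussian with unknown mean mu and variance sigma^2. Setting the derivatives of the log-likelihood to zero gives the estimates in closed form, so no iterative optimizer is needed. The function names gaussian_mle and log_likelihood are illustrative only.

```python
import math


def gaussian_mle(xs):
    """Closed-form maximum-likelihood estimates for x_i ~ N(mu, sigma^2).

    Assumed model for illustration only; not taken from the paper.
    """
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    # d/d(mu) of the log-likelihood is zero at the sample mean.
    mu = sum(xs) / n
    # d/d(sigma^2) of the log-likelihood is zero at the mean squared deviation.
    sigma_sq = sum((x - mu) ** 2 for x in xs) / n
    return mu, sigma_sq


def log_likelihood(xs, mu, sigma_sq):
    """Gaussian log-likelihood, used here only to sanity-check the estimates."""
    n = len(xs)
    squared_error = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * sigma_sq) - squared_error / (2 * sigma_sq)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, sigma_sq_hat = gaussian_mle(data)
print(mu_hat, sigma_sq_hat)
```

A generator in the spirit described above would emit the two summation loops directly once the closed form has been derived. Models without a closed form would need an iterative numerical scheme instead.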