SINDBAD and SiQL: An Inductive Database and Query Language in the Relational Model

  • Authors:
  • Jörg Wicker; Lothar Richter; Kristina Kessler; Stefan Kramer

  • Affiliations:
  • Institut für Informatik I12, Technische Universität München, Garching b. München, Germany D-85748 (all authors)

  • Venue:
  • ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
  • Year:
  • 2008

Abstract

In this demonstration, we present the concepts and an implementation of an inductive database, as proposed by Imielinski and Mannila, in the relational model. The goal is to support all steps of the knowledge discovery process through queries to a database system. The query language SiQL (structured inductive query language), an SQL extension, offers query primitives for feature selection, discretization, pattern mining, clustering, instance-based learning, and rule induction. A prototype system that processes such queries was implemented as part of the SINDBAD (structured inductive database development) project. To support the analysis of multi-relational data, we incorporated multi-relational distance measures based on set distances and recursive descent. Including rule-based classification models required significant extensions to the data model and the software architecture. The prototype is applied to three different data sets, covering gene expression analysis, gene regulation prediction, and structure-activity relationships (SARs) of small molecules.
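
The abstract does not spell out how the multi-relational distance measures are defined. As a rough illustration only, the following Python sketch shows one way a distance "based on set distances and recursive descent" could be built: an instance is a record whose attributes may hold sets of related records, and the distance descends into those sets recursively. All function names, the record layout, and the particular set distance (a symmetric average of minimum distances) are assumptions made for this sketch, not the measures used in SINDBAD.

```python
from numbers import Number


def value_distance(a, b):
    """Distance in [0, 1] between two attribute values (hypothetical helper)."""
    if isinstance(a, list) and isinstance(b, list):
        # Sets of related tuples: compare with a set distance.
        return set_distance(a, b)
    if isinstance(a, dict) and isinstance(b, dict):
        # A nested tuple: descend recursively.
        return tuple_distance(a, b)
    if isinstance(a, Number) and isinstance(b, Number):
        # Assumed numeric distance: relative difference, bounded by 1.
        denom = abs(a) + abs(b)
        return abs(a - b) / denom if denom else 0.0
    # Nominal values (and mismatched types): 0 if equal, 1 otherwise.
    return 0.0 if a == b else 1.0


def tuple_distance(x, y):
    """Mean distance over the attributes that both tuples share."""
    keys = x.keys() & y.keys()
    if not keys:
        return 1.0
    return sum(value_distance(x[k], y[k]) for k in keys) / len(keys)


def set_distance(xs, ys):
    """Symmetric average of minimum distances between two sets of tuples."""
    if not xs and not ys:
        return 0.0
    if not xs or not ys:
        return 1.0
    total = sum(min(value_distance(x, y) for y in ys) for x in xs)
    total += sum(min(value_distance(y, x) for x in xs) for y in ys)
    return total / (len(xs) + len(ys))


# Made-up example data: two molecules, each related to a set of atoms.
mol_a = {"weight": 46.0, "atoms": [{"element": "C", "charge": 0.1},
                                   {"element": "O", "charge": 0.3}]}
mol_b = {"weight": 32.0, "atoms": [{"element": "C", "charge": 0.1}]}

d_ab = tuple_distance(mol_a, mol_b)
```

In this sketch, a distance such as `d_ab` could feed an instance-based learner or a clustering step; how SiQL exposes such measures in queries is not described in the abstract and is not modeled here.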