Analyzing graph databases by aggregate queries

  • Authors:
  • Anton Dries; Siegfried Nijssen

  • Affiliations:
  • K.U. Leuven, Leuven, Belgium; K.U. Leuven, Leuven, Belgium

  • Venue:
  • Proceedings of the Eighth Workshop on Mining and Learning with Graphs
  • Year:
  • 2010

Abstract

An important step in data analysis is the exploration of data. For traditional relational data, one of the most powerful tools for such exploration is the relational database, together with the aggregates and rankings it can compute: for instance, simple statistics such as the average number of links between two types of entities (relations) are easily computed with a query on a relational database and may already provide valuable information. For the exploration of graph data, however, relational databases may not be the most practical or scalable option. For instance, a statistic such as the shortest path between two given nodes cannot be computed by a relational database. Surprisingly, tools for querying graph and network databases are much less well developed than those for relational data, and only recently has an increasing number of studies been devoted to graph or network databases. Our position is that the development of such graph databases is important both to make basic graph mining easier and to prepare data for more complex types of analysis. An important component of such databases is the language used to express aggregating queries, such as shortest path queries. In this paper, we propose an extension to a previously proposed query language. This extension allows databases to be queried and analyzed using aggregates and ranking. A notable feature of our language is that it also supports probabilistic graph queries by treating them as aggregating queries. We demonstrate its value on a simple data analysis task.
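
To make the idea of path queries as aggregates concrete, here is a minimal Python sketch. It is not the query language proposed in the paper; the toy graph, the edge attributes, and the helper names `simple_paths` and `aggregate_over_paths` are all assumptions made for illustration. Each query computes one value per path and then combines those values: a shortest path is the minimum over paths of the summed edge lengths, and a most probable path is the maximum over paths of the product of edge probabilities.

```python
from math import prod

# Hypothetical toy graph: directed edge -> (length, probability the edge exists).
EDGES = {
    ("a", "b"): (1.0, 0.9),
    ("b", "c"): (2.0, 0.5),
    ("a", "c"): (4.0, 0.8),
    ("c", "d"): (1.0, 0.9),
}


def simple_paths(edges, source, target, visited=()):
    """Yield every simple directed path from source to target as a list of edges."""
    visited = visited + (source,)
    if source == target:
        yield []
        return
    for (u, v) in edges:
        if u == source and v not in visited:
            for rest in simple_paths(edges, v, target, visited):
                yield [(u, v)] + rest


def aggregate_over_paths(edges, source, target, per_path, combine):
    """Compute per_path on the edge attributes of each path, then combine the results."""
    values = [
        per_path([edges[e] for e in path])
        for path in simple_paths(edges, source, target)
    ]
    return combine(values) if values else None


# Shortest path length: min over paths of the sum of edge lengths.
shortest = aggregate_over_paths(
    EDGES, "a", "d", lambda attrs: sum(length for length, _ in attrs), min
)

# Most probable path: max over paths of the product of edge probabilities.
most_probable = aggregate_over_paths(
    EDGES, "a", "d", lambda attrs: prod(p for _, p in attrs), max
)

print(shortest, most_probable)
```

Enumerating all simple paths is exponential in general and is used here only to show the shared aggregate structure of the two queries; a practical system would rely on specialised algorithms such as Dijkstra's.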