Data-intensive architecture for scientific knowledge discovery

  • Authors:
  • Malcolm Atkinson; Chee Sun Liew; Michelle Galea; Paul Martin; Amrey Krause; Adrian Mouat; Oscar Corcho; David Snelling

  • Affiliations:
  • School of Informatics, University of Edinburgh, Edinburgh, UK EH8 9AB; Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia 50603; School of Informatics, University of Edinburgh, Edinburgh, UK EH8 9AB; School of Informatics, University of Edinburgh, Edinburgh, UK EH8 9AB; EPCC, University of Edinburgh, Edinburgh, UK EH9 3JZ; EPCC, University of Edinburgh, Edinburgh, UK EH9 3JZ; Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid, Boadilla del Monte, Spain 28660; Fujitsu Laboratories of Europe Limited, Hayes, UK UB4 8FE

  • Venue:
  • Distributed and Parallel Databases
  • Year:
  • 2012

Abstract

This paper presents a data-intensive architecture and demonstrates that it can support applications from a wide range of domains, as well as the different types of users involved in defining, designing and executing data-intensive processing tasks. It introduces the prototype architecture and explains the pivotal role of DISPEL as a canonical language. The architecture promotes the exploration and exploitation of distributed, heterogeneous data and spans the complete knowledge discovery process, from data preparation through analysis to evaluation and reiteration. The architecture was evaluated with large-scale applications from astronomy, cosmology, hydrology, functional genetics, image processing and seismology.