A Framework for Efficient Data Analytics through Automatic Configuration and Customization of Scientific Workflows

  • Authors:
  • Matheus Hauder;Yolanda Gil;Yan Liu

  • Venue:
  • ESCIENCE '11 Proceedings of the 2011 IEEE Seventh International Conference on eScience
  • Year:
  • 2011

Abstract

Data analytics involves choosing among many different algorithms and experimenting with possible combinations of them. Existing approaches, however, do not support scientists in the laborious task of exploring the design space of computational experiments. We have developed a framework to assist scientists with data analysis tasks, in particular machine learning and data mining. It takes advantage of the unique capabilities of the Wings workflow system to reason about semantic constraints. We show how the framework can rule out invalid workflows and help scientists explore the design space. We demonstrate our system in the domain of text analytics and outline the benefits of our approach.
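
The idea of ruling out invalid workflows can be illustrated with a minimal sketch. It does not use the Wings system or its API; the component names, the metadata fields (`output`, `accepts`), and the single compatibility rule are all hypothetical and chosen only to show how a constraint check might prune the space of algorithm combinations.

```python
from itertools import product

# Hypothetical component catalog: each workflow step lists candidate
# algorithms with illustrative metadata. None of these names or fields
# come from Wings or from the paper.
FEATURE_SELECTORS = [
    {"name": "chi_squared", "output": "sparse"},
    {"name": "pca", "output": "dense"},
]
CLASSIFIERS = [
    {"name": "naive_bayes", "accepts": {"sparse", "dense"}},
    {"name": "dense_only_svm", "accepts": {"dense"}},
]


def is_valid(selector, classifier):
    """Assumed constraint: the classifier must accept the selector's output format."""
    return selector["output"] in classifier["accepts"]


def enumerate_workflows():
    """Return the (selector, classifier) name pairs that satisfy the constraint."""
    return [
        (s["name"], c["name"])
        for s, c in product(FEATURE_SELECTORS, CLASSIFIERS)
        if is_valid(s, c)
    ]


if __name__ == "__main__":
    for workflow in enumerate_workflows():
        print(workflow)
```

In this toy catalog, the pairing of `chi_squared` with `dense_only_svm` is discarded before any experiment runs, while the remaining combinations stay available for exploration. A real system would reason over much richer semantic descriptions than this single format check.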