MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
Bioinformatics
Large-Scale DNA sequence analysis in the cloud: a stream-based approach
Euro-Par'11 Proceedings of the 2011 international conference on Parallel Processing - Volume 2
In this paper, we propose to demonstrate a "stream-as-you-go" approach that minimizes the data transfer time of data- and compute-intensive scientific applications deployed in the cloud, by making them incrementally processable. We describe a system that implements this approach based on the IBM InfoSphere Streams computing platform deployed over Amazon EC2. The functionality, performance, and usability of the system will be demonstrated through two DNA sequence analysis applications.
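The idea of making an analysis "incrementally processable" can be illustrated with a minimal sketch. This is not the authors' system, which is built on IBM InfoSphere Streams. It is a plain Python illustration under stated assumptions: the function name `stream_kmer_counts`, the choice of k-mer counting as the analysis, and the chunked string input are all hypothetical. The point it shows is that partial results become available after each chunk arrives, so computation can overlap with data transfer instead of waiting for the full upload.

```python
from collections import Counter
from typing import Iterable, Iterator


def stream_kmer_counts(chunks: Iterable[str], k: int) -> Iterator[Counter]:
    """Yield a snapshot of the running k-mer counts after each chunk arrives.

    Hypothetical helper for illustration only: `chunks` stands in for
    sequence data arriving piece by piece over the network.
    """
    counts: Counter = Counter()
    carry = ""  # last k-1 bases seen so far, so k-mers spanning chunks are counted
    for chunk in chunks:
        window = carry + chunk
        for i in range(len(window) - k + 1):
            counts[window[i:i + k]] += 1
        # Keep only the tail that can still start a k-mer with future data.
        carry = window[-(k - 1):] if k > 1 else ""
        yield Counter(counts)  # copy, so earlier snapshots are not mutated


if __name__ == "__main__":
    # Three chunks of one sequence, processed as they "arrive".
    for snapshot in stream_kmer_counts(["ACG", "TAC", "G"], k=2):
        print(dict(snapshot))
```

Because the carry buffer is shorter than k, no k-mer is counted twice, and the final snapshot matches what a batch pass over the concatenated sequence would produce.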