Many grid computing applications require data streaming management and scheduling, especially when the volume of data to be processed is extremely high while available storage is relatively limited. Large volumes of data from scientific experiments are usually partitioned into lots of small files (LOSF), which poses challenges for data streaming support. This work proposes block-based data transfer, implemented using GridFTP, in which the number of blocks or the size of each block must be scheduled carefully, taking both makespan and available storage into account. To increase processing efficiency, data streaming and processing have to be performed concurrently, and data streaming scheduling must be storage-aware to avoid data overflow. Experimental results show that the proposed optimization method for block-based, concurrent, and storage-aware data streaming handles the LOSF problem efficiently, with relatively good performance in terms of makespan and storage usage.
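The trade-off between block size, makespan, and storage can be illustrated with a small simulation. The sketch below is not the optimization method of this work and does not call GridFTP. It assumes a hypothetical cost model (a fixed per-block transfer overhead, a constant link bandwidth, a constant processing rate, and a fixed storage capacity) and picks a block size by brute force over a few candidates. All function and parameter names are illustrative.

```python
from collections import deque
from typing import List, Optional, Tuple


def make_blocks(file_sizes: List[int], block_size: int) -> List[List[int]]:
    """Greedily pack consecutive small files into blocks of at most block_size.

    A single file larger than block_size is placed in a block of its own.
    """
    blocks: List[List[int]] = []
    current: List[int] = []
    used = 0
    for size in file_sizes:
        if current and used + size > block_size:
            blocks.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        blocks.append(current)
    return blocks


def simulate(blocks: List[List[int]], capacity: int, bandwidth: float,
             proc_rate: float, overhead: float) -> Tuple[float, int]:
    """Return (makespan, peak storage) for concurrent transfer and processing.

    Assumed model: blocks are transferred one at a time, storage is reserved
    when a transfer starts and released when processing of that block ends.
    """
    sizes = [sum(b) for b in blocks]
    if any(s > capacity for s in sizes):
        raise ValueError("a block does not fit into the available storage")
    pending = deque()  # (processing finish time, block size), in FIFO order
    t_link = t_cpu = 0.0
    in_use = peak = 0
    for s in sizes:
        start = t_link
        # Release storage of blocks whose processing has already finished.
        while pending and pending[0][0] <= start:
            in_use -= pending.popleft()[1]
        # Storage-aware step: delay the transfer until the block fits.
        while in_use + s > capacity:
            done, freed = pending.popleft()
            in_use -= freed
            start = max(start, done)
        in_use += s
        peak = max(peak, in_use)
        t_link = start + overhead + s / bandwidth   # block arrival time
        t_cpu = max(t_link, t_cpu) + s / proc_rate  # processing finish time
        pending.append((t_cpu, s))
    return t_cpu, peak


def best_block_size(file_sizes: List[int], candidates: List[int], capacity: int,
                    bandwidth: float, proc_rate: float,
                    overhead: float) -> Optional[Tuple[int, float]]:
    """Brute-force search: return (block size, makespan) with minimal makespan."""
    best: Optional[Tuple[int, float]] = None
    for c in candidates:
        try:
            makespan, _ = simulate(make_blocks(file_sizes, c), capacity,
                                   bandwidth, proc_rate, overhead)
        except ValueError:
            continue  # candidate block size exceeds the storage capacity
        if best is None or makespan < best[1]:
            best = (c, makespan)
    return best


if __name__ == "__main__":
    files = [1] * 8  # eight small files of one size unit each
    print(best_block_size(files, [1, 2, 4, 8], capacity=4,
                          bandwidth=1.0, proc_rate=1.0, overhead=1.0))
```

Under this toy model, very small blocks pay the per-block overhead many times, while very large blocks either exceed the storage capacity or stall the link until processing frees space, so an intermediate block size gives the shortest makespan.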