Random sampling with a reservoir. ACM Transactions on Mathematical Software (TOMS).
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data.
New sampling-based summary statistics for improving approximate query answers. SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data.
NiagaraCQ: a scalable continuous query system for Internet databases. SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data.
Models and issues in data stream systems. Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems.
Continuously adaptive continuous queries over streams. Proceedings of the 2002 ACM SIGMOD international conference on Management of data.
TelegraphCQ: continuous dataflow processing. Proceedings of the 2003 ACM SIGMOD international conference on Management of data.
Aurora: a new model and architecture for data stream management. The VLDB Journal (The International Journal on Very Large Data Bases).
PSoup: a system for streaming queries over streaming data. The VLDB Journal (The International Journal on Very Large Data Bases).
Proceedings of the 2007 ACM SIGMOD international conference on Management of data.
SPADE: the System S declarative stream processing engine. Proceedings of the 2008 ACM SIGMOD international conference on Management of data.
Proceedings of the Third ACM International Conference on Distributed Event-Based Systems.
Facilitating fine grained data provenance using temporal data model. Proceedings of the Seventh International Workshop on Data Management for Sensor Networks.
PIKM 2010: ACM workshop for Ph.D. students in information and knowledge management. CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management.
Emerging multidisciplinary research across database management systems. ACM SIGMOD Record.
One of the major requirements for e-science applications handling sensor data is reproducibility of results. Several optimization and scalability challenges arise while keeping reproducibility of results guaranteed. First, the various data streams need to be coordinated to optimize the accuracy and processing of the results. Second, because of the high volume of streaming data and the series of processing steps performed on that data, the demand for disk space may grow unacceptably high. Last, reproducibility in a decentralized scenario may be difficult to achieve because of data replication. This paper introduces and addresses these challenges, which arise when optimizing the process of achieving reproducibility of results.
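The disk-space concern above can be made concrete with a small sketch in the spirit of the temporal data model cited in the list: instead of persisting every intermediate result, each sensor reading is stored once with its arrival timestamp, and a processing step logs only the time window it consumed; reproducing a past result then means replaying that window. All names and the windowed-mean operator here are illustrative assumptions, not the paper's actual design.

```python
import bisect
from dataclasses import dataclass

@dataclass
class Reading:
    ts: float     # arrival timestamp (the temporal attribute)
    value: float  # sensor value

class TemporalStore:
    """Append-only store of readings, ordered by arrival timestamp."""
    def __init__(self):
        self._ts = []    # sorted timestamps
        self._rows = []  # readings, parallel to _ts

    def append(self, r: Reading):
        # Stream tuples arrive in timestamp order, so append keeps _ts sorted.
        self._ts.append(r.ts)
        self._rows.append(r)

    def window(self, start: float, end: float):
        # Readings with start <= ts < end, found by binary search.
        lo = bisect.bisect_left(self._ts, start)
        hi = bisect.bisect_left(self._ts, end)
        return self._rows[lo:hi]

def windowed_mean(store: TemporalStore, start: float, end: float) -> float:
    # A stand-in for any deterministic processing step over a window.
    rows = store.window(start, end)
    return sum(r.value for r in rows) / len(rows)

store = TemporalStore()
for ts, v in [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0)]:
    store.append(Reading(ts, v))

# Original run: the operator logs only its input window [1.0, 3.5)
# as provenance, rather than materializing its output.
first = windowed_mean(store, 1.0, 3.5)

# Reproduction: replaying the logged window over the raw readings
# regenerates the same result without extra stored intermediates.
again = windowed_mean(store, 1.0, 3.5)
assert first == again == 20.0
```

The trade-off this illustrates is the one the abstract raises: logging windows keeps disk usage proportional to the raw stream, but reproducibility then depends on the raw readings (and, in a decentralized setting, their replicas) remaining consistent.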