Data streams and probabilistic data have recently received considerable attention, but mostly in isolation. However, many applications, including sensor data management systems and object monitoring systems, need to handle both together. Complex correlations and lineage prevent probabilistic DBMSs (PDBMSs) from continuously querying temporal positioning and sensed data. Our main contribution is a new system that continuously runs monitoring queries on probabilistic data streams at satisfactory speed while remaining faithful to the correlations and uncertainty in the data. We design a new data model for probabilistic data streams and present new query operators that implement threshold SPJ queries with aggregation (SPJA queries). Most importantly, we build a working Java-based system, called Xtream, which supports uncertainty from input data streams through to final query results. Unlike probabilistic databases, the data-driven design of Xtream makes it possible to continuously query high volumes of bursty probabilistic data streams. In this paper, after reviewing the main characteristics of and motivating applications for probabilistic data streams, we present our new data model. We then focus on algorithms and approximations for the basic operators (select, project, join, and aggregate). Finally, we compare our prototype with Orion, the only existing probabilistic DBMS that supports continuous distributions. Our experiments show that Xtream outperforms Orion in efficiency metrics such as tuple latency (response time) and throughput, as well as in accuracy, all of which are critical parameters in any probabilistic data stream management system.