We study the problem of computing aggregation operators on probabilistic data in an I/O-efficient manner. Algorithms for aggregation operators such as SUM, COUNT, AVG, and MIN/MAX are crucial to applications on probabilistic databases. We give a generalization of the classical data stream model to handle probabilistic data, called probabilistic streams, in order to analyze the I/O requirements of our algorithms. Whereas the algorithms for SUM and COUNT turn out to be simple, the problem is harder for both AVG and MIN/MAX. Although data stream algorithms typically use randomness, all of the algorithms we present are deterministic. For MIN and MAX, we obtain efficient one-pass data stream algorithms for estimating each of these quantities with relative accuracy (1 + ε), using constant update time per element and O((1/ε) lg R) space, where each element has a value between 1 and R. For AVG, we present a new data stream algorithm for estimating its value to a relative accuracy (1 + ε) in O(log n) passes over the data with O((1/ε) log² n) space and update time O((1/ε) log n) per element. On the other hand, we prove a space lower bound of Ω(n) for any exact one-pass deterministic data stream algorithm. Complementing this result, we also present an O(n log² n)-time exact deterministic algorithm which uses O(n) space (thus removing the data-streaming restriction), improving dramatically on the previous O(n³)-time algorithm. Our algorithms for AVG involve a novel technique based on generating functions and numerical integration, which may be of independent interest. Finally, we provide an experimental analysis and show that our algorithms, coupled with additional heuristics, have excellent performance over large data sets.
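To illustrate why SUM and COUNT are the simple cases, the following is a minimal sketch and not the algorithm from the paper. It assumes a stream encoding that the abstract does not specify: each element is a list of (value, probability) pairs, and any leftover probability mass means the element is absent. The names `expected_sum_count`, `brute_force_sum_count`, and `stream` are hypothetical. By linearity of expectation, the expected SUM and COUNT can be accumulated in one pass with constant state; the brute-force routine enumerates every possible world (assuming independent elements) only as a check on tiny inputs.

```python
from itertools import product


def expected_sum_count(stream):
    """One pass, O(1) state: accumulate E[SUM] and E[COUNT].

    Assumed encoding (not from the source): each element is a list of
    (value, probability) pairs; missing mass means "element absent".
    """
    exp_sum = 0.0
    exp_count = 0.0
    for pdf in stream:
        for value, prob in pdf:
            exp_sum += value * prob   # contribution to E[SUM]
            exp_count += prob         # contribution to E[COUNT]
    return exp_sum, exp_count


def brute_force_sum_count(stream):
    """Enumerate all possible worlds, assuming independent elements.

    Exponential in the stream length; intended only as a sanity check.
    """
    options = []
    for pdf in stream:
        present = sum(p for _, p in pdf)
        # None stands for the "absent" outcome with the leftover mass.
        options.append(list(pdf) + [(None, 1.0 - present)])

    exp_sum = 0.0
    exp_count = 0.0
    for world in product(*options):
        weight = 1.0
        for _, p in world:
            weight *= p
        values = [v for v, _ in world if v is not None]
        exp_sum += weight * sum(values)
        exp_count += weight * len(values)
    return exp_sum, exp_count


# Hypothetical three-element probabilistic stream.
stream = [[(3, 0.5)], [(1, 0.2), (4, 0.3)], [(2, 1.0)]]
```

The same shortcut does not carry over to AVG, MIN, or MAX, since these are not linear in the per-element outcomes, which is consistent with the abstract treating them as the harder cases.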