Probabilistic counting algorithms for data base applications
Journal of Computer and System Sciences
A probabilistic relational algebra for the integration of information retrieval and database systems
ACM Transactions on Information Systems (TOIS)
The space complexity of approximating the frequency moments
Journal of Computer and System Sciences
Space-efficient online computation of quantile summaries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Counting Distinct Elements in a Data Stream
RANDOM '02 Proceedings of the 6th International Workshop on Randomization and Approximation Techniques
Gigascope: a stream database for network applications
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Reconstructing strings from random traces
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Algorithms for dynamic geometric problems over data streams
STOC '04 Proceedings of the thirty-sixth annual ACM symposium on Theory of computing
Load management and high availability in the Medusa distributed stream processing system
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Graph distances in the streaming model: the value of space
SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
OLAP over uncertain and imprecise data
VLDB '05 Proceedings of the 31st international conference on Very large data bases
The space complexity of pass-efficient algorithms for clustering
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
On graph problems in a semi-streaming model
Theoretical Computer Science - Automata, languages and programming: Algorithms and complexity (ICALP-A 2004)
Efficient allocation algorithms for OLAP over imprecise data
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Sketching probabilistic data streams
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Efficient aggregation algorithms for probabilistic data
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Efficient query evaluation on probabilistic databases
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Conditioning and aggregating uncertain data streams: going beyond expectations
Proceedings of the VLDB Endowment
Space-efficient estimation of statistics over sub-sampled streams
PODS '12 Proceedings of the 31st symposium on Principles of Database Systems
CLARO: modeling and processing uncertain data streams
The VLDB Journal — The International Journal on Very Large Data Bases
Parallel skyline queries over uncertain data streams in cloud computing environments
International Journal of Web and Grid Services
The probabilistic stream model was introduced by Jayram et al. [2007]. It generalizes the data stream model to handle probabilistic data: each item of the stream is a probability distribution over a set of possible events. A probabilistic stream therefore determines a distribution over a potentially exponential number of classical deterministic streams, in each of which every item takes exactly one of the domain values. We present algorithms for computing commonly used aggregates on a probabilistic stream. We present the first one-pass streaming algorithms for estimating the expected mean of a probabilistic stream. Next, we consider the problem of estimating frequency moments for probabilistic data. We propose a general approach that obtains unbiased estimators for probabilistic data from unbiased estimators designed for standard streams. Applying this approach, we extend a classical data stream algorithm to obtain a one-pass algorithm for estimating F2, the second frequency moment. We present the first known streaming algorithms for estimating F0, the number of distinct items, on probabilistic streams. Our work also gives an efficient one-pass algorithm for estimating the median, and a two-pass algorithm for estimating the range.
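
To make the model concrete, the following minimal Python sketch enumerates every deterministic stream ("possible world") induced by a tiny probabilistic stream and computes the expected value of F0 by brute force. It then checks the result against the closed form that follows from linearity of expectation. This is an illustration only, not one of the streaming algorithms summarized above. The toy stream, the function names, and the assumptions that items are independent and that each item's probabilities sum to 1 are all hypothetical choices made for the example.

```python
from itertools import product
from math import prod

# Hypothetical toy input: each stream item is a distribution {value: probability}.
# Assumption: items are independent and each distribution sums to 1.
stream = [
    {"a": 0.5, "b": 0.5},
    {"a": 0.25, "c": 0.75},
    {"b": 1.0},
]

def possible_worlds(stream):
    """Yield (deterministic_stream, probability) for every possible world."""
    for choice in product(*(item.items() for item in stream)):
        values = [value for value, _ in choice]
        weight = prod(p for _, p in choice)
        yield values, weight

def expected_f0_brute_force(stream):
    """Expected number of distinct values, summed over all possible worlds."""
    return sum(weight * len(set(values)) for values, weight in possible_worlds(stream))

def expected_f0_closed_form(stream):
    """Sum over values v of the probability that v appears at least once."""
    domain = {value for item in stream for value in item}
    return sum(
        1 - prod(1 - item.get(value, 0.0) for item in stream)
        for value in domain
    )

if __name__ == "__main__":
    print(len(list(possible_worlds(stream))))  # number of possible worlds
    print(expected_f0_brute_force(stream))
    print(expected_f0_closed_form(stream))
```

The brute-force enumeration grows exponentially with the stream length, which is exactly why small-space, few-pass algorithms are of interest. The closed form used here needs the full per-value products and is not itself a small-space method.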