Probabilistic counting algorithms for data base applications
Journal of Computer and System Sciences
A probabilistic relational algebra for the integration of information retrieval and database systems
ACM Transactions on Information Systems (TOIS)
The space complexity of approximating the frequency moments
Journal of Computer and System Sciences
Space-efficient online computation of quantile summaries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Counting Distinct Elements in a Data Stream
RANDOM '02 Proceedings of the 6th International Workshop on Randomization and Approximation Techniques
Gigascope: a stream database for network applications
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Reconstructing strings from random traces
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Algorithms for dynamic geometric problems over data streams
STOC '04 Proceedings of the thirty-sixth annual ACM symposium on Theory of computing
Load management and high availability in the Medusa distributed stream processing system
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Graph distances in the streaming model: the value of space
SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
OLAP over uncertain and imprecise data
VLDB '05 Proceedings of the 31st international conference on Very large data bases
The space complexity of pass-efficient algorithms for clustering
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
On graph problems in a semi-streaming model
Theoretical Computer Science - Automata, languages and programming: Algorithms and complexity (ICALP-A 2004)
Efficient allocation algorithms for OLAP over imprecise data
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Sketching probabilistic data streams
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Efficient aggregation algorithms for probabilistic data
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Efficient query evaluation on probabilistic databases
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Sketching probabilistic data streams
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Event queries on correlated probabilistic streams
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Finding frequent items in probabilistic data
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Approximation algorithms for clustering uncertain data
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Cascadia: A System for Specifying, Detecting, and Managing RFID Events
Proceedings of the 6th international conference on Mobile systems, applications, and services
Sliding-window top-k queries on uncertain streams
Proceedings of the VLDB Endowment
Finding frequent items in data streams
Proceedings of the VLDB Endowment
PROUD: a probabilistic approach to processing similarity queries over uncertain data streams
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Efficiently Clustering Probabilistic Data Streams
APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management
Consensus answers for queries over probabilistic databases
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Finding the frequent items in streams of data
Communications of the ACM - A View of Parallel Computing
Continuously monitoring top-k uncertain data streams: a probabilistic threshold method
Distributed and Parallel Databases
Probabilistic histograms for probabilistic data
Proceedings of the VLDB Endowment
Methods for finding frequent items in data streams
The VLDB Journal — The International Journal on Very Large Data Bases
Proceedings of the forty-second ACM symposium on Theory of computing
PODS: a new model and processing algorithms for uncertain data streams
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Sliding-window top-k queries on uncertain streams
The VLDB Journal — The International Journal on Very Large Data Bases
On wavelet decomposition of uncertain time series data sets
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
ADBIS'10 Proceedings of the 14th east European conference on Advances in databases and information systems
Handling ER-topk query on uncertain streams
DASFAA'11 Proceedings of the 16th international conference on Database systems for advanced applications - Volume Part I
SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
Continuous monitoring of skylines over uncertain data streams
Information Sciences: an International Journal
Efficient subject-oriented evaluating and mining methods for data with schema uncertainty
ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part I
Aggregate queries on probabilistic record linkages
Proceedings of the 15th International Conference on Extending Database Technology
An embedded co-processor for accelerating window joins over uncertain data streams
Microprocessors & Microsystems
FGIT'12 Proceedings of the 4th international conference on Future Generation Information Technology
Range counting coresets for uncertain data
Proceedings of the twenty-ninth annual symposium on Computational geometry
Efficient and scalable monitoring and summarization of large probabilistic data
Proceedings of the 2013 Sigmod/PODS Ph.D. symposium on PhD symposium
Probabilistic k-skyband operator over sliding windows
WAIM'13 Proceedings of the 14th international conference on Web-Age Information Management
Probabilistic skyline operator over sliding windows
Information Systems
Approximation trade-offs in a Markovian stream warehouse: An empirical study
Information Systems
Hi-index | 0.00 |
The probabilistic-stream model was introduced by Jayram et al. [20].It is a generalization of the data stream model that issuited to handling "probabilistic" data, where each item of the stream represents a probability distribution over a set of possible events. Therefore, a probabilistic stream determines a distribution over apotentially exponential number of classical "deterministic" streams where each item is deterministically one of the domain values. Designing efficient aggregation algorithms for probabilistic data is crucial for handling uncertainty in data-centric applications such as OLAP. Such algorithms are also useful in a variety of other setting including analyzing search engine traffic and aggregation in sensor networks. We present algorithms for computing commonly used aggregates ona probabilistic stream. We present the first one pass streaming algorithms for estimating the expected mean of a probabilistic stream, improving upon results in [20]. Next, we consider the problem of estimating frequency moments for probabilistic data. We propose a general approach to obtain unbiased estimators working over probabilistic data by utilizing unbiased estimators designed for standard streams. Applying this approach, we extend a classical data stream algorithm to obtain a one-pass algorithm for estimating F2, the second frequency moment. We present the first known streaming algorithms forestimating F0, the number of distinct items on probabilistic streams.Our work also gives an efficient one-pass algorithm for estimatingthe median of a probabilistic stream.