We present PROUD, a PRObabilistic approach to processing similarity queries over Uncertain Data streams, focusing mainly on time series streams. Unlike a series of exact values, an uncertain series is an ordered sequence of random variables, so the distance between two uncertain series is itself a random variable. We use a general uncertain data model in which only the mean and the deviation of each random variable at each timestamp are available. We derive mathematical conditions for progressively pruning candidates to reduce the computation cost. We then apply PROUD to a streaming environment where only sketches of the streams, such as wavelet synopses, are available. Extensive experiments evaluate the effectiveness of PROUD and compare it with Det, a deterministic approach that processes the data directly without considering uncertainty. The results show that PROUD offers a flexible trade-off between false positives and false negatives, controlled by a threshold, while maintaining a computation cost similar to that of Det, which offers no such flexibility. This trade-off matters because false negatives are more costly in some applications, while in others it is more critical to keep false positives low.
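To make the idea of a threshold-controlled probabilistic match concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the pruning conditions, so the sketch assumes that the per-timestamp errors are independent and Gaussian and that the squared Euclidean distance can be approximated by a normal distribution. The function names (`match_probability`, `probabilistic_match`, `det_match`) and the parameters `radius` and `tau` are hypothetical, introduced only for illustration.

```python
import math
from typing import Sequence


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def match_probability(
    mean_x: Sequence[float],
    std_x: Sequence[float],
    mean_y: Sequence[float],
    std_y: Sequence[float],
    radius: float,
) -> float:
    """Approximate Pr[dist(X, Y)^2 <= radius^2] from means and deviations only.

    ASSUMPTION (not from the source): errors are independent Gaussians, and the
    sum of squared differences is approximated by a normal distribution.
    """
    exp_sum = 0.0  # E[sum of squared differences]
    var_sum = 0.0  # Var[sum of squared differences]
    for mx, sx, my, sy in zip(mean_x, std_x, mean_y, std_y):
        mu = mx - my               # mean of the difference at this timestamp
        var = sx * sx + sy * sy    # variance of the difference (independence)
        exp_sum += mu * mu + var
        # Variance of the square of a Gaussian with mean mu and variance var.
        var_sum += 2.0 * var * var + 4.0 * mu * mu * var
    if var_sum == 0.0:
        # No uncertainty at all: the distance is deterministic.
        return 1.0 if exp_sum <= radius * radius else 0.0
    return normal_cdf((radius * radius - exp_sum) / math.sqrt(var_sum))


def probabilistic_match(mean_x, std_x, mean_y, std_y, radius, tau) -> bool:
    """Report a match when the estimated match probability reaches tau."""
    return match_probability(mean_x, std_x, mean_y, std_y, radius) >= tau


def det_match(mean_x, mean_y, radius) -> bool:
    """Deterministic baseline: compare the means and ignore the deviations."""
    return sum((a - b) ** 2 for a, b in zip(mean_x, mean_y)) <= radius * radius


means = [0.0, 0.0, 0.0, 0.0]
stds = [1.0, 1.0, 1.0, 1.0]
radius = math.sqrt(8.0)

p = match_probability(means, stds, means, stds, radius)
strict = probabilistic_match(means, stds, means, stds, radius, tau=0.9)
lenient = probabilistic_match(means, stds, means, stds, radius, tau=0.3)
baseline = det_match(means, means, radius)
```

In this toy input the two series have identical means, so the deterministic baseline always reports a match, whereas the probabilistic check accepts or rejects the pair depending on `tau`. Raising `tau` trades false positives for false negatives, which is the kind of flexibility the abstract attributes to the threshold.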