Learning in graphical models
Probabilistic Networks and Expert Systems
Probabilistic Networks and Expert Systems
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals
Data Mining and Knowledge Discovery
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Artificial Intelligence: A Modern Approach
Artificial Intelligence: A Modern Approach
A Metric for Distributions with Applications to Image Databases
ICCV '98 Proceedings of the Sixth International Conference on Computer Vision
Fine-Grained Activity Recognition by Aggregating Abstract Object Usage
ISWC '05 Proceedings of the Ninth IEEE International Symposium on Wearable Computers
MauveDB: supporting model-based user views in database systems
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Fast particle smoothing: if I had a million particles
ICML '06 Proceedings of the 23rd international conference on Machine learning
Flowcube: constructing RFID flowcubes for multi-dimensional analysis of commodity flows
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Approximate encoding for direct access and query processing over compressed bitmaps
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Trio: a system for data, uncertainty, and lineage
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
OLAP over uncertain and imprecise data
The VLDB Journal — The International Journal on Very Large Data Bases
Learning and inferring transportation routines
Artificial Intelligence
Estimating statistical aggregates on probabilistic data streams
Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Efficient query evaluation on probabilistic databases
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Efficient pattern matching over event streams
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Efficient storage scheme and query processing for supply chain management using RFID
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Event queries on correlated probabilistic streams
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
BayesStore: managing large, uncertain data repositories with probabilistic graphical models
Proceedings of the VLDB Endowment
Exploiting shared correlations in probabilistic databases
Proceedings of the VLDB Endowment
Online Filtering, Smoothing and Probabilistic Modeling of Streaming data
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Access Methods for Markovian Streams
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Probabilistic Inference over RFID Streams in Mobile Environments
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Ef?cient Query Evaluation over Temporally Correlated Probabilistic Streams
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Indexing correlated probabilistic databases
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Lahar: warehousing markovian streams
Lahar: warehousing markovian streams
Towards expressive publish/subscribe systems
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
A tutorial on particle filters for online nonlinear/non-GaussianBayesian tracking
IEEE Transactions on Signal Processing
Hi-index | 0.00 |
A large amount of the world's data is both sequential and low-level. Many applications need to query higher-level information (e.g., words and sentences) that is inferred from these low-level sequences (e.g., raw audio signals) using a model (e.g., a hidden Markov model). This inference process is typically statistical, resulting in high-level sequences that are imprecise. Once archived, these imprecise streams are difficult to query efficiently because of their rich semantics and large volumes, forcing applications to sacrifice either performance or accuracy. There exists little work, however, that characterizes this trade-off space and helps applications make an appropriate choice. In this paper, we study the effects - on both efficiency and accuracy - of various stream approximations such as ignoring correlations, ignoring low-probability states, or retaining only the single most likely sequence of events. Through experiments on a real-world RFID data set, we identify conditions under which various approximations can improve performance by several orders of magnitude, with only minimal effects on query results. We also identify cases when the full rich semantics are necessary. This study is the first to evaluate the cost vs. quality trade-off of imprecise stream models. We perform this study using Lahar, a prototype Markovian stream warehouse. A secondary contribution of this paper is the development of query semantics and algorithms for processing aggregation queries on the output of pattern queries-we develop these queries in order to more fully understand the effects of approximation on a wider set of imprecise stream queries.