We study sequential dependencies, which express the semantics of data with ordered domains and help identify quality problems in such data. Given an interval g, we write X →g Y to denote that the difference between the Y-attribute values of any two consecutive records, when sorted on X, must lie in g. For example, time →(0,∞) sequence_number indicates that sequence numbers are strictly increasing over time, whereas sequence_number →[4, 5] time means that the time "gaps" between consecutive sequence numbers are between 4 and 5. Sequential dependencies express relationships between ordered attributes and identify missing data (gaps too large), extraneous data (gaps too small), and out-of-order data. To make sequential dependencies applicable to real-world data, we relax their requirements and allow them to hold approximately (with some exceptions) and conditionally (on various subsets of the data). This paper proposes the notion of conditional approximate sequential dependencies and provides an efficient framework for discovering pattern tableaux, which are compact representations of the subsets of the data (i.e., ranges of values of the ordered attributes) that satisfy the underlying dependency. We present analyses of our proposed algorithms, and experiments on real data demonstrating the efficiency and utility of our framework.
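As a rough illustration of the definition above, the following sketch checks a sequential dependency X →g Y on a small list of (X, Y) records and reports the consecutive pairs whose Y-gap falls outside g. It is not the paper's discovery algorithm and does not implement its approximate or conditional variants. The function name `sd_violations`, its parameters, and the sample records are hypothetical, and ties on X are assumed to keep their input order.

```python
from typing import List, Tuple

Record = Tuple[float, float]  # (X value, Y value); hypothetical record layout


def sd_violations(
    records: List[Record],
    lo: float,
    hi: float,
    lo_open: bool = False,
    hi_open: bool = False,
) -> List[Tuple[Record, Record]]:
    """Return consecutive pairs (after sorting on X) whose Y-gap lies outside g.

    The interval g has endpoints lo and hi; lo_open / hi_open mark an
    endpoint as excluded, so (0, inf) is lo=0, hi=inf with both flags set.
    """
    # Sort on X; Python's sort is stable, so ties keep input order (assumption).
    ordered = sorted(records, key=lambda r: r[0])
    violations = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right[1] - left[1]
        too_small = gap <= lo if lo_open else gap < lo
        too_large = gap >= hi if hi_open else gap > hi
        if too_small or too_large:
            violations.append((left, right))
    return violations


# time ->(0, inf) sequence_number: sequence numbers must strictly increase.
by_time = [(1, 10), (2, 11), (3, 11), (4, 15)]  # (time, sequence_number)
print(sd_violations(by_time, 0, float("inf"), lo_open=True, hi_open=True))
# prints [((2, 11), (3, 11))] because the gap there is 0

# sequence_number ->[4, 5] time: consecutive time gaps must be in [4, 5].
by_seq = [(1, 0), (2, 4), (3, 9), (4, 20)]  # (sequence_number, time)
print(sd_violations(by_seq, 4, 5))
# prints [((3, 9), (4, 20))] because the gap there is 11
```

In the second call, the oversized gap of 11 would correspond to the "missing data" case described above, while a gap below 4 would correspond to extraneous data.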