Sequential dependencies

Authors:
Lukasz Golab;Howard Karloff;Flip Korn;Avishek Saha;Divesh Srivastava
Affiliations:
AT&T Labs--Research;AT&T Labs--Research;AT&T Labs--Research;University of Utah;AT&T Labs--Research
Venue:
Proceedings of the VLDB Endowment
Year:
2009

Abstract

We study sequential dependencies that express the semantics of data with ordered domains and help identify quality problems with such data. Given an interval g, we write X →_g Y to denote that the difference between the Y-attribute values of any two consecutive records, when sorted on X, must be in g. For example, time →_(0,∞) sequence_number indicates that sequence numbers are strictly increasing over time, whereas sequence_number →_[4,5] time means that the time "gaps" between consecutive sequence numbers are between 4 and 5. Sequential dependencies express relationships between ordered attributes, and identify missing (gaps too large), extraneous (gaps too small) and out-of-order data. To make sequential dependencies applicable to real-world data, we relax their requirements and allow them to hold approximately (with some exceptions) and conditionally (on various subsets of the data). This paper proposes the notion of conditional approximate sequential dependencies and provides an efficient framework for discovering pattern tableaux, which are compact representations of the subsets of the data (i.e., ranges of values of the ordered attributes) that satisfy the underlying dependency. We present analyses of our proposed algorithms, and experiments on real data demonstrating the efficiency and utility of our framework.
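The basic definition in the abstract can be illustrated with a small check. The following is a minimal sketch, not the paper's algorithm: it tests only the exact (non-approximate, non-conditional) form of a sequential dependency by sorting on X and comparing consecutive Y-gaps against the interval g. The function name `sd_violations`, its parameters, and the sample rows are hypothetical and chosen for illustration. Records with equal X values are kept in input order, which is an assumption made here.

```python
from math import inf


def sd_violations(records, x, y, lo, hi, lo_open=False, hi_open=False):
    """Return (prev, curr, gap) for consecutive records, sorted on x,
    whose y-gap falls outside the interval from lo to hi.

    lo_open / hi_open mark the corresponding endpoint as excluded.
    """
    # Python's sort is stable, so ties on x keep their input order (assumption).
    ordered = sorted(records, key=lambda r: r[x])
    violations = []
    for prev, curr in zip(ordered, ordered[1:]):
        gap = curr[y] - prev[y]
        below = gap <= lo if lo_open else gap < lo
        above = gap >= hi if hi_open else gap > hi
        if below or above:
            violations.append((prev, curr, gap))
    return violations


# Hypothetical sample data: one time gap is too large, one is too small.
rows = [
    {"time": 0, "seq": 1},
    {"time": 4, "seq": 2},
    {"time": 9, "seq": 3},
    {"time": 19, "seq": 4},
    {"time": 20, "seq": 5},
]

# sequence_number ->[4,5] time: time gaps between consecutive sequence numbers.
gap_violations = sd_violations(rows, "seq", "time", 4, 5)

# time ->(0,inf) sequence_number: sequence numbers strictly increase over time.
order_violations = sd_violations(rows, "time", "seq", 0, inf,
                                 lo_open=True, hi_open=True)
```

On the sample rows, the first check flags the gap of 10 (possibly missing data) and the gap of 1 (possibly extraneous data), while the second check finds no out-of-order records. The approximate and conditional variants described in the abstract would need additional machinery beyond this sketch.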