Efficient mining of recurrent rules from a sequence database

Authors:
David Lo; Siau-Cheng Khoo; Chao Liu
Affiliations:
Department of Computer Science, National University of Singapore; Department of Computer Science, National University of Singapore; Department of Computer Science, University of Illinois at Urbana-Champaign
Venue:
DASFAA'08 Proceedings of the 13th international conference on Database systems for advanced applications
Year:
2008

Abstract

We study the novel problem of mining significant recurrent rules from a sequence database. Recurrent rules have the form "whenever a series of precedent events occurs, eventually a series of consequent events occurs". Such rules are intuitive and characterize behaviors in many domains. In software specification, for example, they capture a family of program properties that are useful for program verification and bug detection. Recurrent rules generalize existing work on sequential and episode rules by considering repeated occurrences of premise and consequent events within a sequence and across multiple sequences, and by removing the "window" barrier. To bridge the gap between mined rules and program specifications, we formalize our rules in linear temporal logic. We introduce and apply a novel notion of rule redundancy to ensure efficient mining of a compact, representative set of rules. Performance studies on benchmark datasets and a case study on an industrial system demonstrate the scalability and utility of our approach.
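To make the rule form concrete, the following is a minimal sketch of how one recurrent rule could be checked against a small sequence database. It is not the paper's mining algorithm. It assumes one plausible reading of the semantics: a premise "occurs" at every position where its last event appears after the rest of the premise has appeared in order, and the rule holds at that position if the consequent appears, in order, somewhere later in the same sequence, with no window limit. The function names (`is_subsequence`, `premise_end_points`, `rule_confidence`) and the "fraction of premise occurrences satisfied" score are illustrative assumptions, not definitions taken from the paper.

```python
from typing import List, Sequence


def is_subsequence(pattern: Sequence[str], events: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `events` in order (gaps allowed)."""
    it = iter(events)
    # Each `p in it` consumes the iterator up to the first match,
    # so later pattern events must appear after earlier ones.
    return all(p in it for p in pattern)


def premise_end_points(premise: Sequence[str], seq: Sequence[str]) -> List[int]:
    """Indices j where the premise has just completed.

    Assumed meaning: seq[j] is the last premise event and the rest of
    the premise occurs, in order, strictly before position j.
    """
    return [
        j
        for j, event in enumerate(seq)
        if event == premise[-1] and is_subsequence(premise[:-1], seq[:j])
    ]


def rule_confidence(
    premise: Sequence[str],
    consequent: Sequence[str],
    database: Sequence[Sequence[str]],
) -> float:
    """Fraction of premise occurrences that are eventually followed by
    the consequent, counted within each sequence and across all sequences."""
    total = 0
    satisfied = 0
    for seq in database:
        for j in premise_end_points(premise, seq):
            total += 1
            # No window limit: the consequent may occur anywhere after j.
            if is_subsequence(consequent, seq[j + 1:]):
                satisfied += 1
    return satisfied / total if total else 0.0


# Hypothetical traces of a lock/unlock API, used only for illustration.
traces = [
    ["lock", "use", "unlock", "lock", "use"],
    ["lock", "unlock"],
]
score = rule_confidence(["lock"], ["unlock"], traces)
```

In this toy database, `lock` occurs three times and is eventually followed by `unlock` twice, so `score` is 2/3. The second `lock` in the first trace counts as a separate premise occurrence, which illustrates repeated occurrences within a single sequence.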