Efficient mining of frequent XML query patterns with repeating-siblings

Authors:
Liang Huai Yang;Mong Li Lee;Wynne Hsu;Decai Huang;Limsoon Wong
Affiliations:
College of Information Engineering, Zhejiang University of Technology, Hangzhou, China;School of Computing, National University of Singapore, Singapore;School of Computing, National University of Singapore, Singapore;College of Information Engineering, Zhejiang University of Technology, Hangzhou, China;School of Computing, National University of Singapore, Singapore
Venue:
Information and Software Technology
Year:
2008

Abstract

A recent approach to improving the performance of XML query evaluation is to cache the results of frequent query patterns. Unfortunately, discovering these frequent query patterns is expensive. In this paper, we develop a two-pass mining algorithm, 2PXMiner, that guarantees the discovery of frequent query patterns by scanning the database at most twice. By exploiting a transaction summary data structure and an enumeration tree, we are able to determine upper bounds on the frequencies of the candidate patterns and to quickly prune away the infrequent patterns. We also design an index to trace the repeating candidate subtrees generated by sibling repetition, thus avoiding redundant computations. Experimental results indicate that 2PXMiner is both efficient and scalable.
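
The two-pass idea described above (summarize once, bound candidate frequencies, prune, then count exactly) can be illustrated with a minimal sketch. This is not the 2PXMiner algorithm itself: it assumes, purely for illustration, that each query pattern is flattened into a set of (parent, child) label edges, it ignores tree shape and repeating siblings, and the summary (single-edge and edge-pair counts) and the function name mine_two_pass are hypothetical stand-ins for the transaction summary and enumeration tree mentioned in the abstract.

```python
from collections import Counter
from itertools import combinations


def mine_two_pass(transactions, min_support, max_size=3):
    """Return {pattern: support} for frequent edge sets (illustrative only).

    transactions: iterable of sets of (parent, child) label edges.
    This is a simplification, not the data model of the paper.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: build a summary with the support of every edge and edge pair.
    edge_support = Counter()
    pair_support = Counter()
    for t in transactions:
        edges = sorted(t)
        edge_support.update(edges)
        pair_support.update(combinations(edges, 2))

    def upper_bound(pattern):
        # A pattern cannot occur more often than any of its sub-patterns,
        # so the rarest edge (or edge pair) bounds its frequency from above.
        if len(pattern) == 1:
            return edge_support[pattern[0]]
        return min(pair_support[p] for p in combinations(pattern, 2))

    # Enumerate candidates and prune those whose upper bound is too low.
    all_edges = sorted(edge_support)
    candidates = [
        frozenset(combo)
        for size in range(1, max_size + 1)
        for combo in combinations(all_edges, size)
        if upper_bound(combo) >= min_support
    ]

    # Pass 2: count exact support only for the surviving candidates.
    exact = Counter()
    for t in transactions:
        for c in candidates:
            if c <= t:
                exact[c] += 1
    return {c: n for c, n in exact.items() if n >= min_support}


# Hypothetical query log: each query is a set of parent-child edges.
bt, ba, an, bp = (("book", "title"), ("book", "author"),
                  ("author", "name"), ("book", "price"))
queries = [{bt, ba, an}, {bt, ba}, {bt, bp}, {ba, an}]
frequent = mine_two_pass(queries, min_support=2)
```

In this toy log the three-edge candidate is discarded before the second pass because one of its edge pairs occurs only once, which is the kind of bound-based pruning the abstract refers to. A faithful implementation would operate on rooted tree patterns and handle sibling repetition, which this sketch does not attempt.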