Exploit sequencing to accelerate hot XML query pattern mining

Authors:
Jianhua Feng; Qian Qian; Jianyong Wang; Lizhu Zhou
Affiliations:
Tsinghua University, Beijing, China; Tsinghua University, Beijing, China; Tsinghua University, Beijing, China; Tsinghua University, Beijing, China
Venue:
Proceedings of the 2006 ACM symposium on Applied computing
Year:
2006


Abstract

Speeding up query evaluation in large XML repositories has become a challenging and important problem as XML-related applications proliferate. Once hot XML query patterns have been discovered, indexing and caching can be applied effectively to improve query performance. Previous algorithms for finding hot query patterns essentially rely on a straightforward generate-and-test strategy. In this paper, we present SOLARIA, an efficient algorithm for mining hot XML query patterns without candidate maintenance or costly tree-containment checking. It applies an efficient sequence mining algorithm to the discovery of frequent tree-structured patterns, replacing expensive containment testing with cheap parent-child checking in sequences. SOLARIA prunes unrelated parts of the search space during frequent pattern enumeration by enforcing a parent-child relationship constraint. Motivated by the use of indexing and caching in XML query optimization, we also propose a derived variant of SOLARIA for mining hot "closed" XML query patterns, which provide compact yet complete structural information. Through a thorough experimental study on various real-life data, we demonstrate the efficiency and scalability of SOLARIA over the previously known alternative. SOLARIA also scales linearly with the size of the XML queries.
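
The idea of replacing containment tests with parent-child checks on sequences can be pictured with a minimal sketch. This is not SOLARIA's actual encoding or mining procedure, which the abstract does not specify. The nested-tuple tree format and the names `to_sequence`, `is_parent_child`, and `hot_edges` are assumptions made for illustration, and the sketch counts only single parent-child edges rather than full tree patterns.

```python
from collections import Counter

# Hypothetical input format: a query pattern tree is (label, [children]).

def to_sequence(tree):
    """Flatten a tree in pre-order into a list of (label, parent_position)."""
    seq = []
    stack = [(tree, -1)]  # the root has no parent, marked with -1
    while stack:
        (label, children), parent = stack.pop()
        pos = len(seq)
        seq.append((label, parent))
        # Push children in reverse so the leftmost child is visited first.
        for child in reversed(children):
            stack.append((child, pos))
    return seq

def is_parent_child(seq, i, j):
    """Constant-time check: is position i the parent of position j?"""
    return seq[j][1] == i

def hot_edges(queries, min_support):
    """Count, per query, each distinct (parent label, child label) edge and
    keep the edges that occur in at least min_support queries."""
    counts = Counter()
    for tree in queries:
        seq = to_sequence(tree)
        edges = {(seq[p][0], label) for label, p in seq if p >= 0}
        counts.update(edges)
    return {edge: n for edge, n in counts.items() if n >= min_support}

# A toy workload of three query pattern trees (illustrative data only).
queries = [
    ("book", [("title", []), ("author", [("name", [])])]),
    ("book", [("author", [("name", [])]), ("price", [])]),
    ("book", [("title", []), ("author", [])]),
]

first = to_sequence(queries[0])
hot = hot_edges(queries, min_support=2)
```

Here `first` lists the nodes of the first query in pre-order with each node's parent position, so a parent-child test is a single index comparison and needs no tree traversal. Under the stated threshold, `hot` keeps the edges shared by at least two of the three queries.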