Mining quantitative association rules in large relational tables
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
New sampling-based summary statistics for improving approximate query answers
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Online association rule mining
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Synopsis data structures for massive data sets
External memory algorithms
Mining high-speed data streams
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining time-changing data streams
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Algorithmics and applications of tree and graph searching
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Containment and equivalence for an XPath fragment
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Characterizing memory requirements for queries over continuous data streams
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Sampling from a moving window over streaming data
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Maintaining stream statistics over sliding windows: (extended abstract)
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Continuously adaptive continuous queries over streams
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Processing complex aggregate queries over data streams
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
XCache: a semantic caching system for XML queries
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Discovering Structural Association of Semistructured Data
IEEE Transactions on Knowledge and Data Engineering
Containment for XPath Fragments under DTD Constraints
ICDT '03 Proceedings of the 9th International Conference on Database Theory
Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries
Proceedings of the 27th International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
An Efficient Algorithm for Mining Association Rules in Large Databases
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Sampling Large Databases for Association Rules
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Finding Frequent Items in Data Streams
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
XPath Containment in the Presence of Disjunction, DTDs, and Variables
ICDT '03 Proceedings of the 9th International Conference on Database Theory
Efficiently mining frequent trees in a forest
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering Data Streams: Theory and Practice
IEEE Transactions on Knowledge and Data Engineering
Mining Frequent Quer Patterns from XML Queries
DASFAA '03 Proceedings of the Eighth International Conference on Database Systems for Advanced Applications
Online Algorithms for Mining Semi-structured Data Stream
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
TreeFinder: a First Step towards XML Data Mining
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Approximate join processing over data streams
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Chain: operator scheduling for memory minimization in data stream systems
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Processing set expressions over continuous update streams
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Gigascope: a stream database for network applications
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Aurora: a new model and architecture for data stream management
The VLDB Journal — The International Journal on Very Large Data Bases
Mining concept-drifting data streams using ensemble classifiers
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Efficient elastic burst detection in data streams
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
ACM SIGMOD Record
Comparing data streams using Hamming norms (how to zero in)
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Approximate frequency counts over data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Efficient mining of XML query patterns for caching
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Load shedding in a data stream manager
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Adaptive, hands-off stream mining
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Operator scheduling in a data stream manager
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Online mining of frequent query trees over XML data streams
Proceedings of the 15th international conference on World Wide Web
Matching subsequences in trees
Journal of Discrete Algorithms
Mining tree-structured data on multicore systems
Proceedings of the VLDB Endowment
The tree inclusion problem: In linear space and faster
ACM Transactions on Algorithms (TALG)
The tree inclusion problem: in optimal space and faster
ICALP'05 Proceedings of the 32nd international conference on Automata, Languages and Programming
Matching subsequences in trees
CIAC'06 Proceedings of the 6th Italian conference on Algorithms and Complexity
Processing global XQuery queries based on static query decomposition
ISPA'07 Proceedings of the 5th international conference on Parallel and Distributed Processing and Applications
Hi-index | 0.00 |
Caching query results is one efficient approach to improving the performance of XML management systems. This entails the discovery of frequent XML queries issued by users. In this paper, we model user queries as a stream of XML query pattern trees and mine the frequent query patterns over the query stream. To facilitate the one-pass mining process, we devise a novel data structure called DTS to summarize the pattern trees seen so far. By grouping the incoming pattern trees into batches, we can dynamically mark the active portion of the current batch in DTS and limit the enumeration of candidate trees to only the currently active pattern trees. We also design another summary data structure called ECTree that provides for the incremental computation of the frequent tree patterns over the query stream. Based on the above two constructs, we present two mining algorithms called XQSMinerI and XQSMinerII. XQSMinerI is fast, but it tends to overestimate, while XQSMinerII adopts a filter-and-refine approach to minimize the amount of overestimation. Experimental results show that the proposed methods are both efficient and scalable and require only small memory footprints.