In the past few years, the rapid proliferation of available XML documents has stimulated a great deal of interest in discovering hidden and nontrivial knowledge from XML repositories. However, to the best of our knowledge, no existing work on XML mining has taken into account the dynamic nature of XML documents as online information. The present article proposes a novel type of frequent pattern, namely FRequently And Concurrently muTating substructUREs (FRACTURE), that is mined from the evolution of an XML document. A discovered FRACTURE is a set of substructures of an XML document that frequently change together. Knowledge obtained from FRACTUREs is useful in applications such as XML indexing and XML clustering. To keep the resulting patterns concise and explicit, we further formulate the problem of maximal FRACTURE mining. Two algorithms, which employ the level-wise and divide-and-conquer strategies respectively, are designed to mine the set of FRACTUREs. The second algorithm, which is the more efficient of the two, is also optimized to discover the set of maximal FRACTUREs. Experiments on a wide range of synthetic and real-life datasets verify the efficiency and scalability of the developed algorithms.
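To make the level-wise strategy and the notion of a maximal pattern concrete, the following is a minimal sketch and not the article's algorithm. It assumes that each change between two consecutive versions of the document has already been reduced to a set of changed substructure identifiers, and it uses plain support (the fraction of version changes in which a set of substructures changes together) as the only frequency measure. The abstract does not give the article's actual definition of "frequently", so that measure, the function names `mine_cochanging_sets` and `maximal_only`, and the parameter `min_support` are all illustrative assumptions.

```python
from itertools import combinations


def mine_cochanging_sets(deltas, min_support):
    """Level-wise (Apriori-style) search over sets of substructure ids.

    deltas: one collection of changed substructure ids per version change
            (an assumed, simplified input format).
    min_support: minimum fraction of deltas in which a set must change together.
    Returns a dict mapping each frequent set (frozenset) to its support.
    """
    deltas = [frozenset(d) for d in deltas]
    n = len(deltas)

    def support(itemset):
        # Fraction of version changes in which all members changed together.
        return sum(1 for d in deltas if itemset <= d) / n

    # Level 1: single substructures that change often enough.
    items = sorted({x for d in deltas for x in d})
    current = {frozenset([x]) for x in items
               if support(frozenset([x])) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: combine frequent (k-1)-sets into candidate k-sets.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: keep a candidate only if all of its (k-1)-subsets are
        # frequent, then check its own support against the threshold.
        current = {c for c in candidates
                   if all(frozenset(sub) in frequent
                          for sub in combinations(c, k - 1))
                   and support(c) >= min_support}
    return frequent


def maximal_only(frequent):
    """Keep only the sets that have no frequent proper superset."""
    return {s: sup for s, sup in frequent.items()
            if not any(s < t for t in frequent)}
```

Filtering for maximal sets after the fact, as `maximal_only` does, is the simplest way to obtain a concise result; the abstract indicates that the article instead optimizes its divide-and-conquer algorithm to find maximal FRACTUREs directly, which this sketch does not attempt.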