FRACTURE mining: mining frequently and concurrently mutating structures from historical XML documents

Authors:
Ling Chen; Sourav S. Bhowmick; Liang-Tien Chia
Affiliations:
School of Computer Engineering, Nanyang Technological University, Singapore (all three authors)
Venue:
Data & Knowledge Engineering - Special issue: WIDM 2004
Year:
2006

Abstract

In the past few years, the rapid proliferation of XML documents has stimulated a great deal of interest in discovering hidden and nontrivial knowledge from XML repositories. However, to the best of our knowledge, no existing work on XML mining has taken into account the dynamic nature of XML documents as online information. This article proposes a novel type of frequent pattern, namely FRequently And Concurrently muTating substructUREs (FRACTURE), that is mined from the evolution of an XML document. A discovered FRACTURE is a set of substructures of an XML document that frequently change together. Knowledge obtained from FRACTUREs is useful in applications such as XML indexing and XML clustering. To keep the resulting patterns concise and explicit, we further formulate the problem of maximal FRACTURE mining. Two algorithms, which employ level-wise and divide-and-conquer strategies respectively, are designed to mine the set of FRACTUREs. The second algorithm, which is more efficient, is also optimized to discover the set of maximal FRACTUREs. Experiments on a wide range of synthetic and real-life datasets verify the efficiency and scalability of the developed algorithms.
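
To make the idea of "substructures that frequently change together" concrete, the following is a minimal sketch of a level-wise search in the general Apriori style. It is not the authors' algorithm: the abstract does not give the paper's definitions or measures, so this sketch assumes that each pair of consecutive document versions has already been reduced to a set of changed substructure identifiers, and it uses plain co-occurrence support as a stand-in for whatever frequency measure the paper defines. The names `mine_cochanging_sets`, `maximal_sets`, `deltas`, and `min_support` are hypothetical.

```python
from itertools import combinations


def mine_cochanging_sets(deltas, min_support):
    """Level-wise search for sets of substructures that change together.

    deltas: list of sets; each set holds the (hypothetical) identifiers of
            substructures that changed between two consecutive versions.
    min_support: minimum fraction of deltas in which all members changed.
    Returns a dict mapping each frequent frozenset to its support.
    """
    n = len(deltas)

    def support(itemset):
        # Fraction of deltas in which every member of itemset changed.
        return sum(1 for d in deltas if itemset <= d) / n

    # Level 1: single substructures that change often enough.
    items = sorted({s for d in deltas for s in d})
    current = [frozenset([s]) for s in items
               if support(frozenset([s])) >= min_support]

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: unions of two frequent (k-1)-sets that have size k.
        candidates = {a | b for a, b in combinations(current, 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset must already be frequent
        # (assumed anti-monotone support, as in classic Apriori).
        current = [
            c for c in candidates
            if all(frozenset(sub) in frequent
                   for sub in combinations(c, k - 1))
            and support(c) >= min_support
        ]
    return frequent


def maximal_sets(frequent):
    """Keep only frequent sets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < t for t in frequent)]
```

For example, given four deltas `[{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c"}]` and `min_support=0.5`, the sketch reports `{"a", "b"}` as changing together in three of the four deltas, and the maximal sets are `{"a", "b"}` and `{"c"}`. Reporting only maximal sets keeps the output concise, which is the motivation the abstract gives for maximal FRACTURE mining.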