Mining flexible association rules from XML

Authors:
Elisabetta Caneva;Barbara Oliboni;Elisa Quintarelli
Affiliations:
Univ. degli Studi di Verona, Italy;Univ. degli Studi di Verona, Italy;Politecnico di Milano, Italy
Venue:
Proceedings of the 2009 EDBT/ICDT Workshops
Year:
2009

Abstract

The eXtensible Markup Language (XML) plays an increasingly important role in research on the representation, exchange, and integration of information that comes from different data sources and relates to various contexts, such as medical and biological data. Extracting knowledge from XML datasets is an important problem, and it can be difficult because XML is intrinsically semistructured: documents can have an implicit, irregular structure that is not defined in advance. In this paper, we propose a novel approach for discovering frequent, but approximate, information in XML documents. The approach is based on Flexible Tree Rules, which take into account both the structure and the content of the analyzed data. Our proposal is flexible enough to be adapted both to documents with a regular structure and to documents with a highly heterogeneous structure, and it can be used to evaluate the similarity of XML documents. Moreover, we describe an algorithm to evaluate the similarity degree of a Flexible Tree Rule with respect to an XML document.