The eXtensible Markup Language (XML) plays an increasingly important role in research on the representation, exchange, and integration of information that comes from different data sources and relates to various contexts, such as medical and biological data. Extracting knowledge from XML datasets is an important problem, and it can be difficult because XML is intrinsically semistructured: documents can have an implicit and irregular structure that is not defined in advance. In this paper, we propose a novel approach for discovering frequent, but approximate, information in XML documents. The approach is based on Flexible Tree Rules, which take into account both the structure and the content of the analyzed data. Our proposal is flexible enough to be adapted both to documents with a regular structure and to documents with a highly heterogeneous structure, and it can be used to evaluate the similarity of XML documents. Moreover, we describe an algorithm to evaluate the similarity degree of a Flexible Tree Rule with respect to an XML document.