A statistical interestingness measures for XML based association rules

Authors:
Izwan Nizal Mohd Shaharanee;Fedja Hadzic;Tharam S. Dillon
Affiliations:
Digital Ecosystem and Business Intelligence Institute, Curtin University of Technology, Perth, Australia;Digital Ecosystem and Business Intelligence Institute, Curtin University of Technology, Perth, Australia;Digital Ecosystem and Business Intelligence Institute, Curtin University of Technology, Perth, Australia
Venue:
PRICAI'10 Proceedings of the 11th Pacific Rim international conference on Trends in artificial intelligence
Year:
2010

Abstract

Recently, mining frequent substructures from XML data has attracted considerable interest. Different methods have been proposed and examined for mining frequent patterns from XML documents efficiently and effectively. While many of the frequent XML patterns generated are useful and interesting, a large portion of them is commonly not interesting or significant for the application at hand. In this paper, we present a systematic approach to ascertain whether the discovered XML patterns are significant and not just coincidental associations, and provide a precise statistical approach to support this framework. The proposed strategy combines data mining and statistical measurement techniques to discard the non-significant patterns. We consider the "Prions" database, which describes the protein instances stored for Human Prions Protein. The proposed unified framework is applied to this dataset to demonstrate its effectiveness in assessing the interestingness of discovered XML patterns by statistical means. When the dataset is used for classification/prediction purposes, the proposed approach discards non-significant XML patterns without reducing the accuracy of the pattern set as a whole.