A structure preserving flat data format representation for tree-structured data

Authors:
Fedja Hadzic
Affiliations:
DEBII, Curtin University, Perth, Australia
Venue:
PAKDD'11 Proceedings of the 15th international conference on New Frontiers in Applied Data Mining
Year:
2011

Abstract

Mining of semi-structured data such as XML is a popular research topic because of its many useful applications. The initial work focused mainly on values associated with tags, while most recent developments focus on discovering association rules among tree-structured data objects in order to preserve the structural information. Other data mining techniques have had limited use in tree-structured data analysis, as they were mainly designed to process flat data formats with no need to capture the structural properties of data objects. This paper proposes a novel structure-preserving way of representing tree-structured document instances as records in a standard flat data structure, so that a wider range of data analysis techniques can be applied. Experiments using synthetic and real-world data demonstrate the effectiveness of the proposed approach.
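
The abstract does not specify the encoding the paper uses, so the following is only a minimal sketch of the general idea, not the author's method. It assumes one simple scheme: each node becomes a column named by its positional path from the root, so that every document instance maps to one row of a shared table. The function names `flatten` and `to_table` and the path notation are hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET


def flatten(element, path=None):
    """Return {positional_path: text} for an element and all its descendants.

    Assumption (not from the paper): a node's position is encoded as the
    chain of tags from the root, with a per-tag sibling index.
    """
    path = path or element.tag
    record = {path: (element.text or "").strip()}
    counts = {}
    for child in element:
        # The sibling index keeps repeated tags distinct and ordered.
        counts[child.tag] = counts.get(child.tag, 0) + 1
        child_path = f"{path}/{child.tag}[{counts[child.tag]}]"
        record.update(flatten(child, child_path))
    return record


def to_table(xml_documents):
    """Turn XML strings into (columns, rows); absent nodes become None."""
    records = [flatten(ET.fromstring(doc)) for doc in xml_documents]
    columns = sorted({col for rec in records for col in rec})
    rows = [[rec.get(col) for col in columns] for rec in records]
    return columns, rows


if __name__ == "__main__":
    docs = [
        "<order><item>pen</item><item>ink</item></order>",
        "<order><item>pad</item><note>rush</note></order>",
    ]
    columns, rows = to_table(docs)
    print(columns)
    for row in rows:
        print(row)
```

Because each column name records where a node sits in the tree, a flat-data technique such as a decision tree learner can be applied to the rows while the column names still indicate the structural position of each value.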