EXiT-B: a new approach for extracting maximal frequent subtrees from XML data

Authors:
Juryon Paik;Dongho Won;Farshad Fotouhi;Ung Mo Kim
Affiliations:
Department of Computer Engineering, Sungkyunkwan University, Suwon, Gyeonggi-do, Republic of Korea;Department of Computer Engineering, Sungkyunkwan University, Suwon, Gyeonggi-do, Republic of Korea;Wayne State University, Detroit, MI;Department of Computer Engineering, Sungkyunkwan University, Suwon, Gyeonggi-do, Republic of Korea
Venue:
IDEAL'05 Proceedings of the 6th international conference on Intelligent Data Engineering and Automated Learning
Year:
2005

Citing 8

Frequent Subgraph Discovery
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining

An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data
PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery

Optimized Substructure Discovery for Semi-structured Data
PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery

Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Discovery of Frequent Tag Tree Patterns in Semistructured Web Documents
PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining

Efficiently mining frequent trees in a forest
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining

TreeFinder: a First Step towards XML Data Mining
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining

EFoX: a scalable method for extracting frequent subtrees
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part III

Cited 5

Discovery of Useful Patterns from Tree-Structured Documents with Label-Projected Database
ATC '08 Proceedings of the 5th international conference on Autonomic and Trusted Computing

Process of applying data mining techniques to XML data
Proceedings of the 2006 conference on Advances in Intelligent IT: Active Media Technology 2006

Extraction of interesting financial information from heterogeneous XML-Based data
ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part IV

Extraction of implicit context information in ubiquitous computing environments
ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part IV

A simple yet efficient approach for maximal frequent subtrees extraction from a collection of XML documents
WISE'06 Proceedings of the 7th international conference on Web Information Systems

Abstract

As the amount of available XML data grows, the data mining community has been motivated to discover useful information from collections of XML documents. One of the most popular approaches to finding this information is to extract frequent subtrees from a set of XML trees. In this paper, we propose a novel algorithm, EXiT-B, for efficiently extracting maximal frequent subtrees from a set of XML documents. The main contribution of our algorithm is that no tree join operation is needed during the phase that generates maximal frequent subtrees. Thus, the task of finding maximal frequent subtrees is significantly simplified compared with previous approaches.
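
To make the terms "frequent" and "maximal" concrete, the following is a minimal Python sketch. It is not EXiT-B, whose details the abstract does not give. It assumes a deliberately simplified setting in which the only patterns considered are root-to-node label paths (a restricted kind of subtree), and the function names `label_paths` and `maximal_frequent_paths` and the parameter `min_support` are hypothetical. A path is treated as frequent if it occurs in at least `min_support` documents, and as maximal if no longer frequent path extends it.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def label_paths(xml_text):
    """Return the set of root-to-node label paths found in one XML document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + (child.tag,)))
    return paths


def maximal_frequent_paths(docs, min_support):
    """Return frequent label paths that are not a proper prefix of another frequent path.

    Illustrative simplification: real frequent-subtree mining considers
    branching subtrees, not only single paths.
    """
    counts = Counter()
    for doc in docs:
        # Each document contributes at most once to a path's support.
        counts.update(label_paths(doc))
    frequent = {p for p, c in counts.items() if c >= min_support}
    # Any path that is the parent of a frequent path is not maximal.
    parents = {p[:-1] for p in frequent if len(p) > 1}
    return sorted(p for p in frequent if p not in parents)


if __name__ == "__main__":
    docs = [
        "<book><title/><author><name/></author></book>",
        "<book><title/><author><name/><email/></author></book>",
        "<book><title/><price/></book>",
    ]
    for path in maximal_frequent_paths(docs, min_support=2):
        print("/".join(path))
```

With the three sample documents and `min_support=2`, the paths `book` and `book/author` are frequent but not maximal, because the frequent path `book/author/name` extends them. Reporting only maximal patterns keeps the output compact without losing any frequent pattern, since every frequent path is a prefix of some maximal one.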