Discovery of Frequent Tag Tree Patterns in Semistructured Web Documents
PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Many documents, such as Web documents and XML files, have no rigid structure, and the number of such semistructured documents has been increasing rapidly. We propose a new method for discovering frequent tree-structured patterns in semistructured Web documents. Specifically, we consider the data mining problem of finding all maximally frequent tag tree patterns in semistructured data such as Web documents. A tag tree pattern is an edge-labeled tree that has hyperedges as variables. An edge label is a tag or a keyword in Web documents, and a variable can be substituted by any tree, so a tag tree pattern is well suited to representing tree-structured patterns in semistructured Web documents. We present an algorithm for finding all maximally frequent tag tree patterns, and we report experimental results obtained by applying the algorithm to XML documents.
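The notion of a tag tree pattern can be illustrated with a minimal Python sketch. It is a deliberate simplification and not the algorithm of the paper: it assumes ordered trees, matches a pattern only at the document root, and treats a variable as a single leaf edge that accepts any edge label together with any subtree below it, whereas the abstract's hyperedge variables can be substituted by arbitrary trees. All names (`Node`, `Edge`, `tree`, `matches`, `frequency`, `VAR`) and the sample documents are hypothetical and chosen for illustration only. The sketch checks whether a pattern matches a document and computes the fraction of documents it matches; it does not search for maximally frequent patterns.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumption: an edge whose label is None stands for a variable (patterns only).
VAR = None


@dataclass
class Node:
    # Outgoing edges in document order.
    edges: List["Edge"] = field(default_factory=list)


@dataclass
class Edge:
    label: Optional[str]  # a tag or keyword, or VAR in a pattern
    child: Node


def tree(*edges: Tuple[Optional[str], Node]) -> Node:
    """Build a node from (label, child) pairs."""
    return Node([Edge(label, child) for label, child in edges])


def matches(pattern: Node, data: Node) -> bool:
    """Simplified ordered match, rooted at the two given nodes."""
    if len(pattern.edges) != len(data.edges):
        return False
    for p, d in zip(pattern.edges, data.edges):
        if p.label is VAR:
            # Simplified variable: accepts any label and any subtree below it.
            continue
        if p.label != d.label or not matches(p.child, d.child):
            return False
    return True


def frequency(pattern: Node, documents: List[Node]) -> float:
    """Fraction of documents that the pattern matches."""
    return sum(matches(pattern, doc) for doc in documents) / len(documents)


# Three toy documents, written as edge-labeled trees.
doc1 = tree(("book", tree(("title", tree()),
                          ("author", tree(("name", tree()))))))
doc2 = tree(("book", tree(("title", tree()), ("year", tree()))))
doc3 = tree(("article", tree(("title", tree()))))
docs = [doc1, doc2, doc3]

# Pattern: a "book" edge whose children are a "title" edge and one variable.
pattern = tree(("book", tree(("title", tree()), (VAR, tree()))))

print(matches(pattern, doc1), matches(pattern, doc3))
print(frequency(pattern, docs))
```

In this toy setting the pattern matches the two `book` documents but not the `article` document, because the variable absorbs the differing second child (`author` with its subtree, or `year`). A pattern would count as frequent if this fraction met a chosen minimum threshold.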