Preparations for Semantics-Based XML Mining

Authors:
Jung-Won Lee; Kiho Lee; Won Kim
Venue:
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Year:
2001

Abstract

XML allows users to define elements using arbitrary words and organize them in a nested structure. These features of XML offer both challenges and opportunities in information retrieval, document management, and data mining. In this paper, we propose a new methodology for preparing XML documents for quantitative determination of similarity between XML documents by taking account of XML semantics (i.e., meanings of the elements and nested structures of XML documents). Accurate quantitative determination of similarity between XML documents provides an important basis for a variety of applications of XML document mining and processing. Experiments with XML documents show that our methodology provides a 50-100% improvement in determining similarity over the traditional vector-space model, which considers only term frequency, and 100% accuracy in identifying the category of each document from an on-line bookstore.
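
For orientation, the baseline that the abstract compares against can be illustrated with a minimal sketch. This is not the authors' methodology. It shows only a generic term-frequency vector-space comparison using cosine similarity, and it assumes that terms are lowercase word tokens taken from element text. The function names and the two sample documents are hypothetical and were chosen for illustration.

```python
import math
import re
from collections import Counter
import xml.etree.ElementTree as ET


def term_frequencies(xml_text: str) -> Counter:
    """Count lowercase word tokens in the text content of an XML document.

    Element names and nesting are ignored on purpose: this is the
    term-frequency-only view that the abstract contrasts against.
    """
    root = ET.fromstring(xml_text)
    counts = Counter()
    for text in root.itertext():
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    return counts


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents: same words, different element names and nesting.
doc_a = "<book><title>data mining</title><author>lee</author></book>"
doc_b = "<person><name>lee</name><hobby><kind>data mining</kind></hobby></person>"

score = cosine_similarity(term_frequencies(doc_a), term_frequencies(doc_b))
print(round(score, 3))  # prints 1.0
```

The two sample documents describe different things (a book and a person) yet score as identical, because the term-frequency view discards element meanings and nesting. That is the kind of information the abstract says the proposed methodology takes into account.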