This work presents a methodology for grouping structurally similar XML documents using clustering algorithms. Modeling XML documents as tree-like structures, we treat the 'clustering XML documents by structure' problem as a 'tree clustering' problem, exploiting distances that estimate the similarity between those trees in terms of the hierarchical relationships of their nodes. We suggest using tree structural summaries to improve the performance of the distance calculation while maintaining or even improving its quality. Experimental results are provided using a prototype testbed.
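
As a rough illustration of the idea, the sketch below parses XML documents into trees, reduces each tree to a structural summary, and compares the summaries. The abstract does not say how the summaries are built or which tree distance is used, so both choices here are assumptions: the summary simply collapses identical sibling subtrees, and the distance is a Jaccard distance over root-to-node tag paths, used as a simple stand-in rather than a tree edit distance. The names `summarize`, `label_paths`, and `structural_distance` are hypothetical.

```python
import xml.etree.ElementTree as ET


def summarize(elem):
    """Assumed summary: (tag, children) with identical sibling subtrees collapsed."""
    children = []
    for child in elem:
        summary = summarize(child)
        if summary not in children:  # keep one copy of each repeated subtree
            children.append(summary)
    return (elem.tag, tuple(children))


def label_paths(summary, prefix=()):
    """Collect every root-to-node tag path of a summary tree."""
    tag, children = summary
    path = prefix + (tag,)
    paths = {path}
    for child in children:
        paths |= label_paths(child, path)
    return paths


def structural_distance(xml_a, xml_b):
    """Jaccard distance over tag paths (a stand-in, not a tree edit distance)."""
    paths_a = label_paths(summarize(ET.fromstring(xml_a)))
    paths_b = label_paths(summarize(ET.fromstring(xml_b)))
    return 1.0 - len(paths_a & paths_b) / len(paths_a | paths_b)


if __name__ == "__main__":
    two_books = ("<lib><book><title/><author/></book>"
                 "<book><title/><author/></book></lib>")
    one_book = "<lib><book><title/><author/></book></lib>"
    one_cd = "<lib><cd><title/></cd></lib>"
    print(structural_distance(two_books, one_book))
    print(structural_distance(two_books, one_cd))
```

Under these assumptions, documents that differ only in how often a substructure repeats share the same summary and end up at distance zero, while the summaries stay smaller than the original trees. A pairwise distance of this kind could then be passed to any standard clustering algorithm.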