Abstract: With the standardization of XML as an information exchange language over the Internet, a huge amount of information is now formatted as XML documents. To analyze this information efficiently, a popular practice is to decompose the XML documents and store them in relational tables. However, query processing becomes expensive because, in many cases, an excessive number of joins is required to recover information from the fragmented data. If a collection consists of documents with different structures (for example, documents that conform to different DTDs), mining clusters in the documents could alleviate the fragmentation problem. We propose a hierarchical algorithm (S-GRACE) for clustering XML documents based on structural information in the data. We introduce the notion of a structure graph (s-graph), which supports a computationally efficient distance metric defined between documents and between sets of documents. This simple metric yields a new clustering algorithm that is efficient and effective compared with other approaches based on tree-edit distance. Experiments on real data show that our algorithm can discover clusters not easily identified by manual inspection.
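The abstract does not give the definition of the s-graph or of the distance metric, so the following is only a minimal sketch under stated assumptions: the s-graph of a document is taken to be the set of distinct parent-child tag pairs it contains, and the distance between two s-graphs is taken to be one minus the number of shared edges divided by the size of the larger edge set. The function names `s_graph` and `s_graph_distance` and the sample documents are hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET


def s_graph(xml_text):
    """Return the set of distinct (parent_tag, child_tag) edges in a document.

    Assumption: this edge-set view stands in for the s-graph; the abstract
    itself does not spell out the construction.
    """
    root = ET.fromstring(xml_text)
    edges = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node:
            edges.add((node.tag, child.tag))
            stack.append(child)
    return edges


def s_graph_distance(g1, g2):
    """Assumed metric: 1 - |shared edges| / max(|g1|, |g2|)."""
    larger = max(len(g1), len(g2))
    if larger == 0:
        return 0.0  # two edgeless documents are treated as identical
    return 1.0 - len(g1 & g2) / larger


# Hypothetical sample documents.
doc_a = "<book><title/><author><name/></author></book>"
doc_b = ("<book><title/><author><name/></author>"
         "<author><name/></author></book>")
doc_c = "<article><title/><journal/></article>"

graph_a, graph_b, graph_c = s_graph(doc_a), s_graph(doc_b), s_graph(doc_c)
print(s_graph_distance(graph_a, graph_b))  # repeated elements add no new edges
print(s_graph_distance(graph_a, graph_c))  # no edges in common
```

Under these assumptions the distance needs only set operations on small edge sets, whereas a tree-edit distance compares whole trees node by node. A set of documents could be summarized the same way, for instance by the union of its members' edge sets, although that choice is also an assumption here.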