Handling the structure and content semantics of semistructured documents is a challenge for any document management or knowledge discovery task on such data. In this work, we address the novel problem of clustering semantically related XML documents according to their structure and content features. XML features are generated by enriching syntactic information with semantic information drawn from a lexical knowledge base. The backbone of the proposed framework is a data representation model that uses the notion of tree tuple to identify semantically cohesive substructures in XML documents and represent them as transactional data. The framework is equipped with two clustering algorithms based on different paradigms: centroid-based partitional clustering and frequent-itemset-based hierarchical clustering. An extensive experimental evaluation on real data sets from various domains shows the significance of our approach as a solution for the semantic clustering of XML documents.
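To make the idea of representing XML documents as transactional data more concrete, the following is a minimal sketch and not the method of this work. It assumes a strong simplification: each document becomes a single transaction whose items are (root-to-leaf path, leaf text) pairs, and two transactions are compared with Jaccard similarity. The tree tuple decomposition and the enrichment through a lexical knowledge base are not modeled here, and the names `path_items` and `jaccard` are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET


def path_items(xml_text):
    """Return a set of (path, text) items, one per leaf element.

    Assumption: the whole document is treated as one transaction,
    which is a simplification of the tree tuple model.
    """
    root = ET.fromstring(xml_text)
    items = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        children = list(node)
        if not children:
            # Leaf element: pair its structural path with its normalized text.
            items.add((path, (node.text or "").strip().lower()))
        for child in children:
            walk(child, path)

    walk(root, "")
    return items


def jaccard(a, b):
    """Jaccard similarity between two item sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Toy documents, invented for this illustration only.
doc_a = "<paper><title>XML clustering</title><venue>KDD</venue></paper>"
doc_b = "<paper><title>XML clustering</title><venue>VLDB</venue></paper>"
doc_c = "<book><isbn>123</isbn></book>"

sim_ab = jaccard(path_items(doc_a), path_items(doc_b))
sim_ac = jaccard(path_items(doc_a), path_items(doc_c))
print(sim_ab, sim_ac)
```

In this toy setting, documents that share both structure and content score higher than documents that share neither, which is the kind of signal a centroid-based partitional algorithm could use when assigning transactions to clusters.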