This paper introduces a new document clustering technique based on frequent senses. The proposed system, GDClust (Graph-Based Document Clustering), works with frequent senses rather than the frequent keywords used in traditional text mining techniques. GDClust represents text documents as hierarchical document-graphs and uses an Apriori paradigm to find frequent subgraphs, which reflect frequent senses. The discovered frequent subgraphs are then used to generate sense-based document clusters. We propose a novel multilevel Gaussian minimum support approach for candidate subgraph generation. GDClust uses an English-language ontology to construct document-graphs and applies a graph-based data mining technique for sense discovery and clustering. It is an automated system and requires minimal human interaction to perform clustering.
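The Apriori-style search for frequent subgraphs can be illustrated with a minimal sketch. It is not GDClust's implementation, and it rests on several simplifying assumptions: each document-graph is a set of labeled (child, parent) ontology edges, a "subgraph" is any set of such edges (connectivity is not enforced), and one fixed `min_support` stands in for the multilevel Gaussian minimum support, whose details the abstract does not give. The function name `frequent_edge_sets` and the toy data are hypothetical.

```python
from itertools import combinations


def frequent_edge_sets(doc_graphs, min_support):
    """Level-wise (Apriori-style) search for frequent edge sets.

    doc_graphs  : list of sets; each set holds (child, parent) edge tuples.
    min_support : minimum number of documents that must contain an edge set.
    Returns a dict mapping frozenset-of-edges -> support count.
    Assumption: a fixed threshold is used here, not a Gaussian one.
    """
    def support(edge_set):
        # Count the document-graphs that contain every edge in edge_set.
        return sum(1 for g in doc_graphs if edge_set <= g)

    # Level 1: keep single edges that meet the support threshold.
    all_edges = set().union(*doc_graphs)
    current = {}
    for edge in all_edges:
        candidate = frozenset([edge])
        s = support(candidate)
        if s >= min_support:
            current[candidate] = s

    frequent = dict(current)
    k = 1
    while current:
        # Join step: merge pairs of frequent k-edge sets into (k+1)-edge sets.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) != k + 1:
                continue
            # Prune step: every k-edge subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k)):
                candidates.add(union)

        # Count support only for the surviving candidates.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Toy document-graphs built from made-up hypernym edges.
docs = [
    {("dog", "canine"), ("canine", "carnivore"), ("carnivore", "animal")},
    {("wolf", "canine"), ("canine", "carnivore"), ("carnivore", "animal")},
    {("oak", "tree"), ("tree", "plant")},
]
result = frequent_edge_sets(docs, min_support=2)
for edges, count in sorted(result.items(), key=lambda kv: len(kv[0])):
    print(sorted(edges), count)
```

In this toy run the first two documents share the path from "canine" up to "animal", so that shared structure is what surfaces as frequent. In a sense-based clustering step, documents containing the same frequent subgraphs would be candidates for the same cluster.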