This paper describes a graph-based approach to document classification. A graph representation allows a much more expressive document encoding than the more standard bag of words/phrases approach, and consequently gives improved classification accuracy. Document sets are represented as graph sets, to which a weighted graph mining algorithm is applied to extract frequent subgraphs; these are then further processed to produce feature vectors (one per document) for classification. Weighted subgraph mining is used to ensure both classification effectiveness and computational efficiency, because only the most significant subgraphs are extracted. The approach is validated and evaluated using several popular classification algorithms together with a real-world textual data set. The results demonstrate that the approach can outperform existing text classification algorithms on some data sets. As the size of the data set increases, further processing of the extracted frequent features becomes essential.
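The pipeline outlined above (documents to graphs, weighted mining, one feature vector per document) can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm. It assumes a word-adjacency graph per document, restricts "subgraphs" to single edges, and uses a hypothetical weighted-support measure (an edge's weight divided by the heaviest edge in its graph, averaged over all graphs). The function names and the threshold value are illustrative only.

```python
from collections import Counter


def doc_to_graph(text):
    """Build a word-adjacency graph: (word, next_word) edge -> occurrence count."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))


def mine_weighted_edges(graphs, min_support):
    """Return edges whose weighted support reaches min_support.

    Assumed measure (not from the paper): within each graph an edge's weight
    is normalised by that graph's heaviest edge, and the normalised weights
    are averaged over all graphs.
    """
    totals = Counter()
    for graph in graphs:
        if not graph:
            continue  # a document with fewer than two words has no edges
        heaviest = max(graph.values())
        for edge, weight in graph.items():
            totals[edge] += weight / heaviest
    n = len(graphs)
    return sorted(edge for edge, total in totals.items() if total / n >= min_support)


def to_feature_vectors(graphs, features):
    """One binary vector per document: 1 if the document's graph has the feature."""
    return [[1 if f in graph else 0 for f in features] for graph in graphs]


docs = [
    "graph mining finds frequent patterns",
    "graph mining scales poorly",
    "frequent patterns help classification",
]
graphs = [doc_to_graph(d) for d in docs]
features = mine_weighted_edges(graphs, min_support=0.5)
vectors = to_feature_vectors(graphs, features)
```

The resulting vectors could be passed to any standard classifier. A real implementation would mine multi-edge subgraphs with a dedicated frequent subgraph miner rather than single edges.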