In this paper we describe work relating to classification of web documents using a graph-based model instead of the traditional vector-based model for document representation. We compare the classification accuracy of the vector model approach using the k-Nearest Neighbor (k-NN) algorithm to a novel approach which allows the use of graphs for document representation in the k-NN algorithm. The proposed method is evaluated on three different web document collections using the leave-one-out approach for measuring classification accuracy. The results show that the graph-based k-NN approach can outperform traditional vector-based k-NN methods in terms of both accuracy and execution time.
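The general idea of k-NN over graphs, scored with leave-one-out, can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption: the abstract does not say how graphs are built or compared. Here each document becomes a graph whose nodes are its distinct terms and whose edges link adjacent terms, and the distance is one commonly used maximum-common-subgraph measure, d(G1, G2) = 1 - |mcs(G1, G2)| / max(|G1|, |G2|), with graph size taken as nodes plus edges. The function names and the toy documents are hypothetical.

```python
from collections import Counter


def build_graph(text):
    """Assumed representation: nodes are distinct terms, edges link adjacent terms."""
    words = text.lower().split()
    nodes = set(words)
    edges = set(zip(words, words[1:]))  # directed edge from each term to the next
    return nodes, edges


def graph_size(graph):
    """Assumed size measure: number of nodes plus number of edges."""
    nodes, edges = graph
    return len(nodes) + len(edges)


def mcs_distance(g1, g2):
    """Distance in [0, 1] based on the maximum common subgraph.

    Because every node label is unique within a graph, the common subgraph
    is simply the intersection of the node sets and of the edge sets.
    """
    common = (g1[0] & g2[0], g1[1] & g2[1])
    denom = max(graph_size(g1), graph_size(g2))
    if denom == 0:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - graph_size(common) / denom


def knn_classify(query, training, k=3):
    """Majority vote among the k training graphs closest to the query graph."""
    ranked = sorted(training, key=lambda item: mcs_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k=3):
    """Classify each graph using all the others and return the hit rate."""
    correct = 0
    for i, (graph, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_classify(graph, rest, k) == label:
            correct += 1
    return correct / len(data)


# Toy data, invented purely for illustration.
docs = [
    ("stock market prices rise", "finance"),
    ("market prices fall sharply", "finance"),
    ("team wins football match", "sports"),
    ("football team loses match", "sports"),
]
training = [(build_graph(text), label) for text, label in docs]
query = build_graph("stock prices rise in market")
print(knn_classify(query, training, k=3))  # prints "finance"
```

Calling `leave_one_out_accuracy(training, k=1)` on the same toy data holds out each document in turn and classifies it against the remaining three, which mirrors the evaluation procedure named in the abstract on a much smaller scale.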