In this paper we present an approach to structure learning for web documents, with the aim of supporting webgenre tagging in web corpus linguistics. A central outcome of the paper is that purely structure-oriented approaches to web document classification provide an information gain that can be exploited in approaches combining web content and structure analysis.