Exploiting Structural Information for Text Classification on the WWW

  • Authors:
  • Johannes Fürnkranz

  • Venue:
  • IDA '99 Proceedings of the Third International Symposium on Advances in Intelligent Data Analysis
  • Year:
  • 1999

Abstract

In this paper, we report on a set of experiments that explore the utility of the structural information in WWW documents. Our working hypothesis is that it is often easier to classify a hypertext page using information from the pages that point to it than using information on the page itself. We present experimental evidence that confirms this hypothesis on a set of Web pages from Computer Science departments.
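
The hypothesis in the abstract can be illustrated with a minimal sketch. The code below is not the paper's method or data: the link list, the page labels, the choice of anchor text as the "information on pointing pages", and the naive Bayes classifier are all assumptions made for illustration. It gathers the words that other pages use when linking to a target page and classifies the target from those words alone, never reading the target page itself.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy link graph: (source page, target page, anchor text).
LINKS = [
    ("dept/index.html", "people/smith.html", "Professor Smith faculty home page"),
    ("dept/staff.html", "people/smith.html", "faculty member Smith"),
    ("dept/index.html", "courses/cs101.html", "CS101 course syllabus"),
    ("people/smith.html", "courses/cs101.html", "introductory programming course"),
    ("dept/staff.html", "people/jones.html", "Professor Jones faculty page"),
    ("dept/index.html", "courses/cs202.html", "CS202 algorithms course"),
]

# Hypothetical training labels for two of the target pages.
LABELS = {"people/smith.html": "faculty", "courses/cs101.html": "course"}


def inbound_text(links):
    """Map each target page to a bag of words taken from links pointing to it."""
    bags = defaultdict(Counter)
    for _source, target, anchor in links:
        bags[target].update(anchor.lower().split())
    return bags


def train(bags, labels):
    """Collect per-class word counts for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for page, label in labels.items():
        word_counts[label].update(bags[page])
        class_counts[label] += 1
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def classify(bag, model):
    """Return the most probable class for a bag of inbound-link words."""
    word_counts, class_counts, vocab = model
    total_pages = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_pages)
        for word, n in bag.items():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace smoothing avoids zero probabilities.
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += n * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    bags = inbound_text(LINKS)
    model = train(bags, LABELS)
    for page in ("people/jones.html", "courses/cs202.html"):
        print(page, classify(bags[page], model))
```

Anchor text is only one possible source of structural information; text surrounding the link or headings on the pointing page could be collected the same way by changing what `inbound_text` extracts.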