Web document classification using modified decision trees

  • Authors:
  • Wen-Chen Hu;Kai-Hsiung Chang;Gerhard X. Ritter

  • Affiliations:
  • Auburn University, Auburn, AL;Auburn University, Auburn, AL;University of Florida, Gainesville, FL

  • Venue:
  • ACM-SE 38 Proceedings of the 38th annual on Southeast regional conference
  • Year:
  • 2000

Abstract

Searching for Web pages is one of the most common tasks performed on the Web, and Web page classification is the first step in constructing a Web search service. This paper proposes a method for classifying Web documents using a modified decision tree of height three, which splits the root, the depth-one nodes, and the depth-two nodes on keywords, descriptions, and hyperlinks, respectively. Classification starts with a Web page at the root of the decision tree and traces paths downward to the leaves, which give the categories of the page.
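The traversal described in the abstract can be illustrated with a minimal Python sketch. Only the level order (keywords, then descriptions, then hyperlinks) and the multi-path descent to leaves come from the abstract. The `Node` class, the `build` and `classify` helpers, the substring-match branch test, and the sample rules and page are all hypothetical and are not the authors' actual splitting criteria.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Feature examined at each depth, in the order given in the abstract.
LEVEL_FEATURES = ["keywords", "description", "hyperlinks"]


@dataclass
class Node:
    depth: int = 0
    category: str = ""  # set only on leaves
    # Assumed branch rule: trigger term -> child node.
    children: Dict[str, "Node"] = field(default_factory=dict)


def build(rules: List[Tuple[str, str, str, str]]) -> Node:
    """Build a height-three tree from (keyword, description, link, category) rules."""
    root = Node(depth=0)
    for *terms, category in rules:
        node = root
        for depth, term in enumerate(terms, start=1):
            node = node.children.setdefault(term, Node(depth=depth))
        node.category = category
    return root


def classify(page: Dict[str, str], node: Node) -> Set[str]:
    """Follow every matching branch downward and collect the leaf categories."""
    if not node.children:
        return {node.category} if node.category else set()
    text = page.get(LEVEL_FEATURES[node.depth], "").lower()
    found: Set[str] = set()
    for term, child in node.children.items():
        if term in text:  # assumed test: case-insensitive substring match
            found |= classify(page, child)
    return found


# Hypothetical rules and page, for illustration only.
tree = build([
    ("python", "tutorial", "docs.python.org", "Programming"),
    ("python", "snake", "zoo", "Animals"),
    ("football", "league", "scores", "Sports"),
])
page = {
    "keywords": "python, programming",
    "description": "A Python tutorial for beginners",
    "hyperlinks": "https://docs.python.org/3/ https://example.org",
}
print(classify(page, tree))
```

Because `classify` follows every branch whose test succeeds, a single page can reach several leaves, which matches the abstract's statement that the leaves give the categories (plural) of the page.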