A Text Mining Approach on Automatic Generation of Web Directories and Hierarchies

  • Authors:
  • Hsin-Chang Yang; Chung-Hong Lee

  • Venue:
  • WI '03 Proceedings of the 2003 IEEE/WIC International Conference on Web Intelligence
  • Year:
  • 2003

Abstract

There is an enormous number of web pages in the world, so retrieving required information from the WWW is an arduous task. The WWW community has used different models for retrieving web pages. One of the most widely used models is traversing a predefined web directory hierarchy to reach a user's goal. Web directories are compiled or classified folders of web pages and are usually organized into a hierarchical structure. The classification of web pages into proper directories and the organization of directory hierarchies are generally performed by human experts. In this work, we provide a method that applies text mining techniques to a set of web pages to automatically create web directories and organize them into hierarchies. The method is based on the self-organizing map learning algorithm and requires no human intervention during the construction of web directories and hierarchies. The experiments show that our method can produce comprehensible and reasonable web directories and hierarchies.
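For readers unfamiliar with the self-organizing map (SOM) named in the abstract, the following is a minimal, generic sketch of SOM training on term vectors. It is not the authors' implementation: the abstract does not specify the page encoding, map size, or training schedule, so the binary term vectors, the 2x2 map, the linearly decaying learning rate, the Gaussian neighborhood, and all names (`train_som`, `best_matching_unit`, `group_pages`, the toy vocabulary and pages) are assumptions made for illustration only.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the index of the unit whose weight vector is closest to `vector`."""
    def sq_dist(w):
        return sum((wi - vi) ** 2 for wi, vi in zip(w, vector))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols SOM; units are stored row-major in a flat list."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # assumed linear decay
        radius = max(radius0 * (1.0 - frac), 0.5)    # assumed shrinking radius
        for v in rng.sample(vectors, len(vectors)):  # shuffled pass over the data
            b = best_matching_unit(weights, v)
            br, bc = divmod(b, cols)
            for u, w in enumerate(weights):
                ur, uc = divmod(u, cols)
                grid_d2 = (ur - br) ** 2 + (uc - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))  # Gaussian neighborhood
                for k in range(dim):
                    w[k] += lr * h * (v[k] - w[k])   # pull unit toward the input
    return weights


def group_pages(pages, weights):
    """Group page names by the map unit they are assigned to."""
    groups = {}
    for name, vec in pages.items():
        groups.setdefault(best_matching_unit(weights, vec), []).append(name)
    return groups


# Hypothetical toy data: one binary entry per vocabulary term.
vocabulary = ["football", "league", "recipe", "oven"]
pages = {
    "sports_a": [1, 1, 0, 0],
    "sports_b": [1, 1, 0, 0],
    "cooking_a": [0, 0, 1, 1],
    "cooking_b": [0, 0, 1, 1],
}

weights = train_som(list(pages.values()), rows=2, cols=2)
groups = group_pages(pages, weights)
print(groups)
```

Pages assigned to the same map unit can be read as candidates for a shared directory. How the paper names directories and derives the hierarchy from the trained map is not described in the abstract, so this sketch stops at grouping.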