Query based optimal web site clustering using simulated annealing

  • Authors:
  • Wookey Lee;Young Kuk Kim;Bok Sik Yoon;Jiang Jin Xi

  • Affiliations:
  • Inha University, Incheon, Korea;Chungnam National University, Daejeon, Korea;Hongik University, Seoul, Korea;Yanbian University, Jilin Province, China

  • Venue:
  • Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
  • Year:
  • 2008

Abstract

Clustering is a viable technique for dealing with the scaling issue of web documents, and it is known to be a complicated combinatorial optimization problem. Because it is hard to develop a generally applicable optimal algorithm for web document clustering and classification, a simulated annealing algorithm is developed. The web document classification problem is addressed as the problem of finding the best match between a web query and a hypothesized web object. The normalized term frequency and inverse document frequency (TF-IDF) coefficient is used as the measure of the match. Test beds are generated on-line during the search by transforming web sites. As a result, web sites can be clustered optimally in terms of the keyword vectors of their corresponding web documents.
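
The abstract does not give the paper's objective function or annealing schedule, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain normalized TF-IDF weighting, cosine similarity as the match measure, a fixed number of clusters k, a pairwise cost (dissimilarity inside clusters plus similarity across clusters), and a geometric cooling schedule. All function names, parameters, and the toy documents are hypothetical.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalized TF-IDF dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors


def cosine(u, v):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def cost(assign, vectors):
    """Assumed objective: penalize dissimilar pairs placed in the same
    cluster and similar pairs placed in different clusters."""
    total = 0.0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            sim = cosine(vectors[i], vectors[j])
            total += (1.0 - sim) if assign[i] == assign[j] else sim
    return total


def anneal(vectors, k, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over cluster labels with geometric cooling."""
    rng = random.Random(seed)
    assign = [rng.randrange(k) for _ in vectors]
    current = cost(assign, vectors)
    best, best_cost = assign[:], current
    temp = t0
    for _ in range(steps):
        # Propose moving one randomly chosen document to a random cluster.
        i = rng.randrange(len(vectors))
        old = assign[i]
        assign[i] = rng.randrange(k)
        candidate = cost(assign, vectors)
        delta = candidate - current
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp) (Metropolis criterion).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
            if current < best_cost:
                best, best_cost = assign[:], current
        else:
            assign[i] = old  # reject the move
        temp *= cooling
    return best, best_cost


# Toy keyword lists standing in for web documents (illustrative only).
docs = [
    ["cat", "pet", "food"],
    ["cat", "pet", "toy"],
    ["stock", "market", "price"],
    ["stock", "market", "trade"],
]
vectors = tfidf_vectors(docs)
labels, score = anneal(vectors, k=2)
print(labels, round(score, 4))
```

Recomputing the full cost for every proposed move keeps the sketch short but is quadratic in the number of documents; a practical version would update the cost incrementally for the single document that moved.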