A Unified Framework for Clustering Heterogeneous Web Objects

  • Authors:
  • Hua-Jun Zeng;Zheng Chen;Wei-Ying Ma

  • Venue:
  • WISE '02 Proceedings of the 3rd International Conference on Web Information Systems Engineering
  • Year:
  • 2002

Abstract

In this paper, we introduce a novel framework for clustering web data which is often heterogeneous in nature. As most existing methods often integrate heterogeneous data into a unified feature space, their flexibilities to explore and adjust contributing effect from different heterogeneous information are compromised. In contrast, our framework enables separate clustering of homogeneous data in the entire process based on their respective features, and a layered structure with link information is used to iteratively project and propagate the clustered results between layers until it converges. Our experimental results show that such a scheme not only effectively overcomes the problem of data sparseness caused by the high dimensional link space but also improves the clustering accuracy significantly. We achieve 19% and 41% performance increases when clustering web-pages and users based on a semi-synthetic web log. Finally, we show a real clustering result based on UC Berkeley's web log.
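
The layered scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or the convergence test, so this sketch assumes plain k-means inside each layer, a user-by-page visit matrix as the link information, and "labels stop changing" as the stopping rule. All function names and parameters (kmeans, project, combine, layered_clustering, link_weight) are hypothetical. Each layer is clustered separately on its own features, and the links are projected onto the other layer's current clusters, so the link space shrinks from one dimension per object to one dimension per cluster.

```python
import numpy as np


def kmeans(X, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    X = np.asarray(X, dtype=float)
    centers = [X[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    labels = np.full(len(X), -1)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def project(links, col_labels, k):
    """Sum link columns per cluster: (n_rows, n_cols) -> (n_rows, k)."""
    return links @ np.eye(k)[col_labels]


def combine(own_feats, link_feats, link_weight):
    """Join a layer's own features (if any) with its weighted link features."""
    if own_feats is None:
        return link_weight * link_feats
    return np.hstack([np.asarray(own_feats, dtype=float), link_weight * link_feats])


def layered_clustering(links, k_users, k_pages, user_feats=None,
                       page_feats=None, link_weight=1.0, max_rounds=20):
    """Alternate clustering of the two layers until both label sets are stable."""
    links = np.asarray(links, dtype=float)
    # Initial page clustering uses the raw (high-dimensional) link vectors.
    page_labels = kmeans(combine(page_feats, links.T, link_weight), k_pages)
    user_labels = np.full(links.shape[0], -1)
    for _ in range(max_rounds):
        # Users are described by their links to page clusters, not to pages.
        new_users = kmeans(
            combine(user_feats, project(links, page_labels, k_pages), link_weight),
            k_users)
        # Propagate back: pages are described by their links to user clusters.
        new_pages = kmeans(
            combine(page_feats, project(links.T, new_users, k_users), link_weight),
            k_pages)
        converged = (np.array_equal(new_users, user_labels)
                     and np.array_equal(new_pages, page_labels))
        user_labels, page_labels = new_users, new_pages
        if converged:
            break
    return user_labels, page_labels


# Toy visit matrix (rows: users, columns: pages) with two visible groups.
demo_links = np.array([
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
])
demo_users, demo_pages = layered_clustering(demo_links, k_users=2, k_pages=2)
```

The link_weight argument is one assumed way to expose the "contributing effect" of link information relative to each layer's own features; the abstract states that this effect can be adjusted but does not say how.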