A Distributed Hierarchical Clustering System for Web Mining

Authors:
Catherine W. Wen;Huan Liu;Wilson X. Wen;Jeffery Zheng
Venue:
WAIM '01 Proceedings of the Second International Conference on Advances in Web-Age Information Management
Year:
2001

Abstract

This paper proposes a novel method of distributed hierarchical clustering for Web mining. The method is closely related to our earlier work on Self-Generated Neural Networks (SGNN), which is in turn based on both self-organizing neural networks and concept formation. The complexity of the algorithm is at most O(MN log N). With the distributed implementation, the method can be scaled up easily. The method is independent of the order in which the Web documents are presented. It produces a natural conceptual hierarchy rather than a binary tree, and it can include multimedia information in the same cluster hierarchy. A visualization mechanism has been developed for the clustering method, and it shows that the cluster hierarchy generated by the method is of very high quality. The clustering process is fully automatic and requires no human intervention. A clustering system has been built on the proposed method; it can be used to automatically generate multimedia search engines, Web directories, decision-making assistance systems, knowledge management systems, and personalized knowledge portals.
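
The abstract gives no algorithmic detail, so the following is only a generic sketch of incremental, tree-structured clustering with non-binary nodes. It is not the authors' SGNN-based method. The names `Node`, `insert`, `build`, and `threshold` are hypothetical, and this simple version depends on the order of insertion, unlike the method described above. Under the assumption that N is the number of documents and M the number of features per document, each insertion into a roughly balanced tree with bounded fan-out costs about O(M log N), which is one way a bound of the form O(MN log N) can arise.

```python
import math


class Node:
    """A cluster node: centroid of every vector below it, plus child nodes."""

    def __init__(self, vector):
        self.centroid = list(vector)
        self.count = 1
        self.children = []  # any number of children, so the tree is not binary

    def absorb(self, vector):
        """Update the running mean so the centroid also covers `vector`."""
        self.count += 1
        for i, x in enumerate(vector):
            self.centroid[i] += (x - self.centroid[i]) / self.count


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def insert(root, vector, threshold):
    """Route `vector` down the tree toward the nearest centroid (hypothetical rule)."""
    node = root
    while True:
        if not node.children:
            # A leaf holds exactly one sample: keep it as a child and add
            # the new sample as its sibling, making this node internal.
            node.children.append(Node(node.centroid))
            node.children.append(Node(vector))
            node.absorb(vector)
            return
        node.absorb(vector)
        winner = min(node.children, key=lambda c: distance(c.centroid, vector))
        if distance(winner.centroid, vector) > threshold:
            # Too far from every existing child: start a new sub-cluster here.
            node.children.append(Node(vector))
            return
        node = winner


def build(vectors, threshold):
    """Build a hierarchy from a non-empty sequence of feature vectors."""
    root = Node(vectors[0])
    for v in vectors[1:]:
        insert(root, v, threshold)
    return root


def leaf_count(node):
    """Number of leaves (individual samples) under `node`."""
    if not node.children:
        return 1
    return sum(leaf_count(c) for c in node.children)


if __name__ == "__main__":
    docs = [[0, 0], [0, 1], [10, 10], [10, 11], [0.5, 0.5]]
    tree = build(docs, threshold=3.0)
    print(len(tree.children), leaf_count(tree))
```

In this sketch the root can hold three or more children, which illustrates how a hierarchy can follow the data rather than being forced into a binary tree. Feature extraction from Web documents or multimedia content, and the distributed execution, are outside the scope of the example.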