H-BayesClust: A New Hierarchical Clustering Based on Bayesian Networks

Authors:
Morteza Haghir Chehreghani; Hassan Abolhassani
Affiliations:
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran; Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
Venue:
ADMA '07 Proceedings of the 3rd international conference on Advanced Data Mining and Applications
Year:
2007

Abstract

Clustering is one of the most important approaches for mining and extracting knowledge from the web. This paper presents a method for clustering web data that uses a Bayesian network to find appropriate representatives for each cluster. With those representatives, more accurate clusters can be created. In addition, the contents of the web pages are converted into vectors in a way that, first, reduces the number of dimensions and, second, solves the orthogonality problem. Experimental results show the high quality of the resulting clusters.