Clustering is one of the most important approaches for mining and extracting knowledge from the web. This paper presents a method for clustering web data that uses a Bayesian network to find appropriate representatives for each cluster. With those representatives, more accurate clusters can be created. In addition, the contents of the web pages are converted into vectors in a way that first reduces the number of dimensions and second solves the orthogonality problem. Experimental results show the high quality of the resulting clusters.
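
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of representative-based clustering over document vectors, not the paper's method. It assumes plain bag-of-words term-frequency vectors and cosine similarity, and the representatives are supplied by hand rather than found with a Bayesian network. All function names and sample texts are hypothetical. The last document also illustrates one common reading of the orthogonality problem: two texts on a related topic that share no terms get a similarity of exactly zero.

```python
import math
from collections import Counter


def to_vector(text):
    """Build a bag-of-words term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_to_representatives(docs, representatives):
    """Label each document with the index of its most similar representative."""
    rep_vecs = [to_vector(r) for r in representatives]
    labels = []
    for doc in docs:
        vec = to_vector(doc)
        sims = [cosine(vec, rep) for rep in rep_vecs]
        # On ties, max() keeps the first (lowest-index) representative.
        labels.append(max(range(len(sims)), key=sims.__getitem__))
    return labels


# Hypothetical, hand-picked representatives (the paper selects them
# with a Bayesian network, which is not reproduced here).
representatives = ["car engine repair", "football match score"]
docs = [
    "engine repair manual",
    "match score tonight",
    "automobile motor maintenance",  # related to cars, but shares no terms
]

labels = assign_to_representatives(docs, representatives)
orthogonal_sim = cosine(to_vector(docs[2]), to_vector(representatives[0]))
```

In this sketch the third document has zero similarity to every representative, so its assignment is decided only by tie-breaking. A vector representation that maps related terms to shared dimensions would avoid that, which is presumably the kind of issue the dimension reduction step is meant to address.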