Clustering Web data is an important technique for extracting knowledge from the Web. This paper presents a novel method to facilitate such clustering. The method determines the appropriate number of clusters and provides suitable representatives for each cluster by inference from a Bayesian network. The Bayesian network is also used to convert the contents of the Web pages into lower-dimensional vectors. The method is further extended to hierarchical clustering, and a useful heuristic is developed to select a good hierarchy. The experimental results show that the clusters produced are of high quality. (The value of this threshold is subjective and depends on human perceptions of relevancy, precision, and recall. It can be easily determined by a few limited human-oriented examinations.) © 2012 Wiley Periodicals, Inc.