A major challenge in web mining arises from the growing number of web pages and the high dimensionality associated with natural language. Because identifying the web pages that belong to a class of interest is often the first step in mining the web, web page classification is one of the essential techniques for web mining. One of its main difficulties is the high-dimensional vocabulary space of text. In this research, we propose a method for web page classification based on Hidden Naive Bayes. We also propose using the ReliefF feature selection method to select relevant words and thereby improve classification performance. Comparisons with traditional techniques are provided. Results on a benchmark dataset show that the proposed methods are promising for accurate web page classification.
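To make the feature selection step more concrete, here is a minimal sketch of ReliefF-style word weighting. It is not the authors' implementation: the function name `relieff_weights`, the toy data, the Manhattan distance, and the choice to visit every instance rather than a random sample are all assumptions made for illustration, and features are assumed to be scaled to [0, 1] (for example, word presence flags).

```python
def relieff_weights(X, y, k=1):
    """Simplified ReliefF-style feature weights (illustrative sketch only).

    X: list of feature vectors with values in [0, 1]; y: list of class labels.
    Every instance is visited once instead of drawing a random sample.
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d
    classes = sorted(set(y))
    prior = {c: y.count(c) / n for c in classes}

    def dist(a, b):
        # Manhattan distance, an assumption of this sketch
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        for c in classes:
            # k nearest neighbours of instance i inside class c (excluding i)
            neighbours = sorted(
                (j for j in range(n) if j != i and y[j] == c),
                key=lambda j: dist(X[i], X[j]),
            )[:k]
            if not neighbours:
                continue
            if c == y[i]:
                scale = -1.0  # near hits: differing features are penalised
            else:
                # near misses: differing features are rewarded, weighted by class prior
                scale = prior[c] / (1.0 - prior[y[i]])
            for j in neighbours:
                for f in range(d):
                    diff = abs(X[i][f] - X[j][f])
                    weights[f] += scale * diff / (n * len(neighbours))
    return weights


# Toy data (hypothetical): word 0 separates the classes, word 1 is noise.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = ["a", "a", "b", "b"]
weights = relieff_weights(X, y, k=1)
# Keep the highest-weighted word as the selected feature.
selected = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)[:1]
```

In this toy case the discriminative word receives a higher weight than the noisy one, so only it would be passed on to the classifier. A real vocabulary would need a cutoff or a top-N rule chosen by validation.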