Web page classification is one of the essential techniques for Web mining. This paper presents a framework for Web page classification built on a hybrid neural network architecture that combines PCA (principal component analysis) with an SOFM (self-organizing feature map). To perform the classification, a web page is first represented as a feature vector whose weights depend on the term frequency and on the importance of each sentence in the page. Because the number of features is large, PCA is used to select the relevant features. Finally, the output of PCA is passed to the SOFM for classification. For comparison with the proposed framework, two conventional classifiers are used in our experiments: k-NN and Naïve Bayes. On both data sets, the proposed method gives significantly better classification results than the two conventional methods.
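The pipeline described above (sentence-weighted term vectors, then PCA, then a self-organizing map) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the per-sentence importance scores, the grid size, the learning-rate and neighborhood schedules, and the majority-vote labeling of map units are all hypothetical choices made here for illustration. Only NumPy is used.

```python
import numpy as np
from collections import Counter

def weighted_term_vectors(pages, vocab):
    """Build one vector per page. Each page is a list of (sentence, importance)
    pairs; a term's weight is the sum of the importance scores of the sentences
    it occurs in (an assumed weighting scheme)."""
    index = {term: i for i, term in enumerate(vocab)}
    X = np.zeros((len(pages), len(vocab)))
    for row, page in enumerate(pages):
        for sentence, importance in page:
            for token in sentence.lower().split():
                if token in index:
                    X[row, index[token]] += importance
    return X

def pca_project(X, n_components):
    """Project centered data onto its leading principal directions via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def train_som(Z, grid=(3, 3), epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM with a Gaussian neighborhood.
    The linear decay of learning rate and radius is an assumption."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    W = rng.normal(scale=0.1, size=(rows * cols, Z.shape[1]))
    n_steps, step = epochs * len(Z), 0
    for _ in range(epochs):
        for i in rng.permutation(len(Z)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 1e-3
            bmu = best_matching_unit(W, Z[i])
            grid_dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            W += lr * h[:, None] * (Z[i] - W)  # pull units toward the sample
            step += 1
    return W

def best_matching_unit(W, z):
    """Index of the map unit whose weight vector is closest to z."""
    return int(np.argmin(((W - z) ** 2).sum(axis=1)))

def label_units(W, Z, labels):
    """Assign each unit the majority label of the training samples mapped to it
    (one common way to use a SOM as a classifier; assumed here)."""
    votes = {}
    for z, label in zip(Z, labels):
        votes.setdefault(best_matching_unit(W, z), Counter())[label] += 1
    return {unit: counts.most_common(1)[0][0] for unit, counts in votes.items()}

if __name__ == "__main__":
    # Toy pages with made-up sentence importance scores.
    pages = [
        [("neural network training", 2.0), ("network layers", 1.0)],
        [("football match score", 2.0), ("match report", 1.0)],
    ]
    vocab = ["neural", "network", "football", "match"]
    X = weighted_term_vectors(pages, vocab)
    Z = pca_project(X, n_components=1)
    W = train_som(Z, grid=(2, 2), epochs=20)
    print(label_units(W, Z, ["computing", "sports"]))
```

A new page would be classified by building its vector with the same vocabulary, projecting it with the principal directions fitted on the training data, and looking up the label of its best-matching unit. The sketch above recomputes the projection inside `pca_project`, so a real system would need to store the training mean and directions for reuse.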