Web Document Classification Based on Fuzzy k-NN Algorithm

Authors:
Juan Zhang;Yi Niu;Huabei Nie
Venue:
CIS '09 Proceedings of the 2009 International Conference on Computational Intelligence and Security - Volume 01
Year:
2009

Abstract

Web document classification is an important web mining technique, and web page classification has been studied extensively since the Internet became a huge repository of information. k-nearest neighbor (k-NN) is a simple classification algorithm that assigns a pattern of unknown class to the majority class among its k nearest neighbors of known class, according to a distance measure. Its main drawback is that every pattern of known class is treated as equally important when the unknown pattern is assigned. Fuzzy k-nearest neighbor (fuzzy k-NN) is an improved variant of k-NN that has been applied successfully to structural data classification. This paper presents web document classification based on a fuzzy k-NN network. During classification, TF/IDF (term frequency / inverse document frequency) is adopted for selecting document features, and membership grades are used to increase accuracy and to better suit real-world data. Experimental results show that the classification performance is better than that of both k-NN and the support vector machine (SVM).
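The abstract does not give formulas, so the following is only a minimal sketch of how TF/IDF features and fuzzy k-NN membership grades could fit together, not the authors' implementation. It assumes the commonly used inverse-distance membership weighting with fuzzifier m, cosine distance between documents, and a smoothed IDF. The function names, the parameters k, m and eps, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def vectorize(tokens, idf):
    """Map a token list to a sparse TF-IDF dict, ignoring terms without an IDF."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values()) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector per tokenized document, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption; the abstract does not specify the variant).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [vectorize(doc, idf) for doc in docs], idf


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse vectors, clamped to be non-negative."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return max(0.0, 1.0 - dot / (na * nb))


def fuzzy_knn(query, train_vecs, train_labels, k=3, m=2.0, eps=1e-9):
    """Return class membership grades for `query` from its k nearest neighbors.

    Each neighbor votes with weight 1 / d^(2/(m-1)), so closer documents
    count more than distant ones (plain k-NN would weight them equally).
    """
    neighbors = sorted(
        (cosine_distance(query, v), y) for v, y in zip(train_vecs, train_labels)
    )[:k]
    weights = defaultdict(float)
    for d, y in neighbors:
        weights[y] += 1.0 / (d ** (2.0 / (m - 1.0)) + eps)
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}


# Toy corpus (hypothetical data, for illustration only).
docs = [
    "football match goal team".split(),
    "team wins league match".split(),
    "stock market shares fall".split(),
    "market investors buy shares".split(),
]
labels = ["sports", "sports", "finance", "finance"]

vecs, idf = tfidf_vectors(docs)
query = vectorize("team scores goal in match".split(), idf)
memberships = fuzzy_knn(query, vecs, labels, k=3)
predicted = max(memberships, key=memberships.get)
```

Here `memberships` holds one grade per class found among the neighbors, and the grades sum to 1. Taking the class with the highest grade gives a crisp label, while the grades themselves indicate how confident the assignment is, which a plain majority vote cannot express.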