This paper addresses the problem of finding an adequate representation of a web page for subsequent classification and data mining. The approach focuses on the textual part of web pages, which are represented by a two-dimensional vector whose components are sorted by the relevance of each word in the text. Two approaches for computing word relevance, one analytical and one fuzzy, are presented; both take advantage of characteristics of the HTML language. The two models are compared in learning and classification tasks to evaluate the suitability of each approach. The experiments show a clear improvement of the fuzzy method over the analytical one. Both approaches are general, in the sense that any characteristic of web pages could be integrated easily and without additional cost.
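
To make the representation concrete, the following is a minimal sketch of a relevance-sorted word vector built from HTML cues. It is not the paper's method: the abstract does not say which HTML characteristics are used or how they are combined, so the three features (overall frequency, presence in the title, emphasis), the tag set, the weights, and the names `WordCollector` and `represent` are all hypothetical. The linear combination is only loosely in the spirit of an analytical approach; a fuzzy variant would replace it with a rule base over the same features.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class WordCollector(HTMLParser):
    """Counts words overall, in <title>, and inside emphasis tags."""

    # Assumed set of tags that signal emphasis; not taken from the paper.
    EMPHASIS_TAGS = {"b", "strong", "em", "i", "h1", "h2", "h3"}
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.frequency = Counter()
        self.in_title = Counter()
        self.emphasized = Counter()

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the most recent matching tag, tolerating sloppy markup.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i] == tag:
                del self.open_tags[i]
                break

    def handle_data(self, data):
        if self.SKIPPED_TAGS.intersection(self.open_tags):
            return  # ignore script and style content
        words = re.findall(r"[a-z]+", data.lower())
        self.frequency.update(words)
        if "title" in self.open_tags:
            self.in_title.update(words)
        if self.EMPHASIS_TAGS.intersection(self.open_tags):
            self.emphasized.update(words)


def represent(html, weights=(0.5, 0.3, 0.2)):
    """Return (word, relevance) pairs sorted by decreasing relevance.

    The weights for (frequency, title, emphasis) are illustrative only.
    """
    collector = WordCollector()
    collector.feed(html)
    collector.close()
    w_freq, w_title, w_emph = weights
    max_freq = max(collector.frequency.values(), default=1)
    max_emph = max(collector.emphasized.values(), default=1)
    scored = []
    for word, count in collector.frequency.items():
        relevance = (
            w_freq * count / max_freq
            + w_title * (1.0 if word in collector.in_title else 0.0)
            + w_emph * collector.emphasized[word] / max_emph
        )
        scored.append((word, relevance))
    # Sort by relevance, breaking ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


if __name__ == "__main__":
    page = (
        "<html><head><title>Fuzzy pages</title></head>"
        "<body><p>fuzzy <b>logic</b> ranks pages and pages</p></body></html>"
    )
    for word, relevance in represent(page):
        print(f"{word}\t{relevance:.3f}")
```

Because each feature is just one more weighted term, adding another page characteristic (for example, link text) would mean one more counter and one more weight, which is the kind of extensibility the abstract claims for both approaches.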