Rules revisited: web page classification

Authors:
Aristotelis Katsaris;Isambo Karali
Affiliations:
University of Athens, Athens, Greece;University of Athens, Athens, Greece
Venue:
CI '07 Proceedings of the Third IASTED International Conference on Computational Intelligence
Year:
2007

Abstract

The importance of web page classification grows with the continuous increase of information available on the Internet. Web page classification serves two purposes: filtering the enormous search space of the Web by considering only relevant pages when attempting to locate a specific kind of information, and providing semantic information when trying to obtain high-precision results. To classify a web page, its structure should be considered together with its text content. In this paper, we present our approach, which addresses the problem using derivation rules and heuristics as well as analysis of the web page structure at a high semantic level. This approach was implemented in the ExpertCat system.
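
As a rough illustration only, the idea of combining page structure with text content through rules might be sketched as follows. The abstract does not describe the actual rules or heuristics of ExpertCat, so everything here is an assumption made for the example: the structural regions (title, heading, link, body), the categories, the keywords, the weights, and the names StructureExtractor, RULES and classify are all hypothetical.

```python
from html.parser import HTMLParser


class StructureExtractor(HTMLParser):
    """Collects lower-cased words per structural region of a page.

    The choice of regions is an assumption for this sketch, not ExpertCat's.
    """

    REGION_TAGS = {"title": "title", "h1": "heading", "h2": "heading", "a": "link"}

    def __init__(self):
        super().__init__()
        self.regions = {"title": [], "heading": [], "link": [], "body": []}
        self._stack = []  # regions currently open, innermost last

    def handle_starttag(self, tag, attrs):
        if tag in self.REGION_TAGS:
            self._stack.append(self.REGION_TAGS[tag])

    def handle_endtag(self, tag):
        if tag in self.REGION_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text outside any tracked tag is counted as plain body text.
        region = self._stack[-1] if self._stack else "body"
        self.regions[region].extend(data.lower().split())


# Hypothetical rules: (category, region, keyword, weight).
# A keyword in a prominent region (title) counts more than in the body.
RULES = [
    ("course", "title", "course", 3),
    ("course", "heading", "syllabus", 2),
    ("course", "body", "lecture", 1),
    ("homepage", "title", "home", 3),
    ("homepage", "link", "publications", 2),
    ("homepage", "body", "research", 1),
]


def classify(html):
    """Return the best-scoring category, or None if no rule fires."""
    parser = StructureExtractor()
    parser.feed(html)
    parser.close()
    scores = {}
    for category, region, keyword, weight in RULES:
        if keyword in parser.regions[region]:
            scores[category] = scores.get(category, 0) + weight
    if not scores:
        return None
    return max(scores, key=scores.get)


if __name__ == "__main__":
    page = (
        "<html><head><title>Course on Logic</title></head>"
        "<body><h1>Syllabus</h1><p>Each lecture covers one topic.</p></body></html>"
    )
    print(classify(page))
```

In this sketch a rule fires when its keyword appears in the named region, and the weights act as a simple heuristic for how much each region is trusted. A real rule-based system would likely use richer conditions than single keywords.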