We propose a web page classification method for building a high-quality collection of researchers' homepages. We present a method that uses a recall-assured classifier and a precision-assured classifier to reduce the manual assessment required to assure a given precision and recall. Each classifier is built with an SVM by tuning its parameters and using textual features obtained from each page and its surrounding pages. These pages are grouped by connection type and relative URL hierarchy level, and features are extracted independently from each group. Experimental results show that the proposed features clearly improve classification performance and that the manual assessment is significantly reduced.
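The grouping of surrounding pages and the use of two classifiers can be illustrated with a minimal Python sketch. It is not the authors' implementation: the connection-type labels ("in-link", "out-link"), the three-way upper/same/lower split of URL levels, the whitespace tokenizer, the zero decision thresholds, and the rule for combining the two classifiers are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def url_depth(url):
    """Count the non-empty path segments of a URL."""
    return len([seg for seg in urlparse(url).path.split("/") if seg])


def relative_level(target_url, neighbor_url):
    """Level of a neighbor relative to the target page (assumed 3-way split)."""
    diff = url_depth(neighbor_url) - url_depth(target_url)
    if diff < 0:
        return "upper"
    if diff > 0:
        return "lower"
    return "same"


def grouped_features(target_url, target_text, neighbors):
    """Word counts with a separate feature space for each page group.

    neighbors: iterable of (url, connection_type, text) tuples, where
    connection_type is a hypothetical label such as "in-link" or "out-link".
    """
    features = Counter(f"self:{w}" for w in target_text.lower().split())
    for url, connection, text in neighbors:
        # Prefixing each word with its group keeps the groups independent.
        group = f"{connection}/{relative_level(target_url, url)}"
        features.update(f"{group}:{w}" for w in text.lower().split())
    return features


def triage(recall_score, precision_score):
    """Assumed combination rule for the two classifiers' decision values.

    Accept what the precision-assured classifier marks positive, reject what
    the recall-assured classifier marks negative, and send the rest to
    manual assessment.
    """
    if precision_score > 0:
        return "accept"
    if recall_score <= 0:
        return "reject"
    return "manual"
```

Under these assumptions, only pages on which the two classifiers disagree reach a human assessor, which is one way the manual workload could shrink. The resulting feature dictionaries could then be vectorized and passed to any SVM implementation.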