On the web, information about an entity is often spread across a set of pages that together form a logical page group, and such groups need to be handled properly. This paper proposes a method for collecting researchers' homepages (or entry pages) that applies new, simple, and effective page group models combining page group structure with page content. The aim is to narrow down the candidates for subsequent, more precise but more expensive processing. We therefore focus mainly on high recall rather than on precision.
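The idea of pooling evidence over a page group can be illustrated with a minimal sketch. It is not the paper's model: the keyword list, the directory-based grouping, the index-page heuristic, and the names `group_key`, `keyword_hits`, `pick_entry`, and `candidate_entry_pages` are all assumptions made for illustration. The sketch groups pages by host and directory, sums keyword hits over each group, and keeps the group's entry page whenever the pooled score reaches a deliberately low threshold, which favors recall over precision.

```python
from collections import defaultdict
from urllib.parse import urlparse
import posixpath

# Hypothetical keyword list; the abstract does not specify any features.
KEYWORDS = ("publications", "research", "professor", "curriculum vitae")
# Hypothetical file names treated as entry pages of a directory.
INDEX_NAMES = ("", "index.html", "index.htm")


def group_key(url):
    """Group pages by host and directory (an assumed stand-in for a logical page group)."""
    parts = urlparse(url)
    directory = posixpath.dirname(parts.path) or "/"
    return (parts.netloc, directory)


def keyword_hits(text):
    """Count how many of the keywords occur in the page text."""
    lowered = text.lower()
    return sum(1 for kw in KEYWORDS if kw in lowered)


def pick_entry(urls):
    """Prefer an index-like page; otherwise fall back to the shortest URL."""
    def rank(url):
        name = posixpath.basename(urlparse(url).path)
        return (name not in INDEX_NAMES, len(url), url)
    return min(urls, key=rank)


def candidate_entry_pages(pages, threshold=1):
    """pages maps URL to page text; return entry pages of groups scoring >= threshold."""
    groups = defaultdict(list)
    for url in pages:
        groups[group_key(url)].append(url)

    candidates = []
    for urls in groups.values():
        # Evidence is pooled over the whole group, not only the entry page.
        group_score = sum(keyword_hits(pages[u]) for u in urls)
        if group_score >= threshold:
            candidates.append(pick_entry(urls))
    return sorted(candidates)


pages = {
    "http://example.org/~sato/index.html": "Welcome to my page",
    "http://example.org/~sato/pubs.html": "Publications and research projects",
    "http://example.org/shop/index.html": "Buy cheap goods",
}
result = candidate_entry_pages(pages)
print(result)
```

In this toy input the entry page itself contains none of the keywords, so a content-only filter would drop it. Pooling over the group keeps it as a candidate, and a later, heavier classifier could then decide whether to accept it.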