A large amount of book information is scattered across the Web. Like other Web information, it is written in non-standard forms, which places a heavy burden on users browsing search results. We propose an automatic editing method that helps users retrieve book information, especially book reviews, scattered across the Web. The proposed system first retrieves the bibliographic information of a user-specified book from a library catalog database. Using this information, it retrieves book reviews from the Web; these are then edited automatically with heuristic rules for segment extraction, filtering, and sorting according to the semantic likelihood that they are book reviews, and are finally presented to the user in tables. We implemented a prototype system and evaluated its effectiveness in a preliminary experiment.
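The editing stage described above (segment extraction, filtering, sorting by review likelihood, tabular presentation) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the actual heuristic rules, so the cue-word list, the scoring function, the threshold, and all function names here are hypothetical stand-ins, and the Web retrieval and catalog lookup steps are left out.

```python
import re
from dataclasses import dataclass

# Hypothetical cue words; the source does not list its actual heuristic rules.
REVIEW_CUES = ("review", "recommend", "i read", "enjoyed", "boring", "plot", "chapter")


@dataclass
class Segment:
    url: str
    text: str
    score: float = 0.0


def extract_segments(url, page_text, title):
    """Split a page on blank lines and keep the blocks that mention the book title."""
    blocks = re.split(r"\n\s*\n", page_text)
    return [Segment(url, b.strip()) for b in blocks if title.lower() in b.lower()]


def review_score(text):
    """Assumed likelihood proxy: the fraction of cue words that occur in the text."""
    lowered = text.lower()
    hits = sum(1 for cue in REVIEW_CUES if cue in lowered)
    return hits / len(REVIEW_CUES)


def edit_reviews(pages, title, threshold=0.2):
    """Extract, score, filter, and sort segments from a {url: page_text} mapping."""
    kept = []
    for url, page_text in pages.items():
        for seg in extract_segments(url, page_text, title):
            seg.score = review_score(seg.text)
            if seg.score >= threshold:  # filtering step (threshold is an assumption)
                kept.append(seg)
    # Sorting step: most review-like segments first.
    return sorted(kept, key=lambda s: s.score, reverse=True)


def to_table(segments):
    """Render the edited segments as a plain-text table."""
    lines = ["score | url | excerpt"]
    for s in segments:
        lines.append(f"{s.score:.2f} | {s.url} | {s.text[:60]}")
    return "\n".join(lines)
```

Calling `edit_reviews(pages, title)` on already-downloaded page text and passing the result to `to_table` yields one row per retained segment. A real system would presumably match on the bibliographic record (author, publisher, ISBN) obtained from the catalog rather than on the title string alone.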