The key technology in comparison shopping is the extraction of information about online goods. Based on the DOM, and after testing on a large number of sample pages, a two-stage information extraction pattern and the concept of a page information unit (PIU) are proposed. PIUs are first extracted and categorized by a classification algorithm, and information is then extracted from each PIU. Extraction of the key information of online goods has been implemented with this algorithm. Tests on sample pages show that the algorithm is stable and achieves high recall and precision.
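The two-stage pattern described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how PIUs are delimited or classified, so the sketch assumes that each direct child of `<body>` is one PIU and that a PIU describes goods when its text contains a dollar price. The sample page and all function names (`split_into_pius`, `classify_piu`, `extract_goods`, `precision_recall`) are hypothetical, and the page is assumed to be well-formed XHTML so that the standard-library XML parser can build the tree.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a goods PIU is recognised by a dollar price in its text.
PRICE_RE = re.compile(r"\$\s*(\d+(?:\.\d{2})?)")

# Hypothetical, well-formed sample page used only for illustration.
SAMPLE_PAGE = """
<html><body>
  <div class="nav"><a href="/">Home</a><a href="/cart">Cart</a></div>
  <div class="item"><h2>USB Cable</h2><span>$4.99</span></div>
  <div class="item"><h2>Desk Lamp</h2><span>$19.50</span></div>
  <div class="footer"><p>Contact us</p></div>
</body></html>
"""


def text_of(node):
    """Return the node's text content with whitespace collapsed."""
    return " ".join("".join(node.itertext()).split())


def split_into_pius(root):
    """Stage 1a: treat each direct child of <body> as one PIU (assumption)."""
    return list(root.find("body"))


def classify_piu(piu):
    """Stage 1b: label a PIU 'goods' if it contains a price, else 'other'."""
    return "goods" if PRICE_RE.search(text_of(piu)) else "other"


def extract_goods(piu):
    """Stage 2: pull the name and price out of a PIU labelled 'goods'."""
    heading = piu.find(".//h2")
    price = PRICE_RE.search(text_of(piu))
    return {
        "name": text_of(heading) if heading is not None else None,
        "price": float(price.group(1)),
    }


def extract_page(html):
    """Run both stages over one page and return the goods records."""
    root = ET.fromstring(html.strip())
    goods = [p for p in split_into_pius(root) if classify_piu(p) == "goods"]
    return [extract_goods(p) for p in goods]


def precision_recall(extracted, expected):
    """Compute (precision, recall) for two sets of extracted item names."""
    hits = len(extracted & expected)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(expected) if expected else 0.0
    return precision, recall


if __name__ == "__main__":
    for record in extract_page(SAMPLE_PAGE):
        print(record)
```

Keeping classification separate from field extraction means the price-based rule can be swapped for a trained classifier without touching the second stage. Real product pages are rarely well-formed XML, so a tolerant HTML parser would be needed in practice.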