SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
XTRACT: a system for extracting document type descriptors from XML documents
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Wrapper induction: efficiency and expressiveness
Artificial Intelligence - Special issue on Intelligent internet systems
WebViews: accessing personalized web content and services
Proceedings of the 10th international conference on World Wide Web
Annotea: an open RDF infrastructure for shared Web annotations
Proceedings of the 10th international conference on World Wide Web
IEPAD: information extraction based on pattern discovery
Proceedings of the 10th international conference on World Wide Web
Effective Web data extraction with standard XML technologies
Proceedings of the 10th international conference on World Wide Web
ChangeDetector: a site-level monitoring tool for the WWW
Proceedings of the 11th international conference on World Wide Web
Visual Web Information Extraction with Lixto
Proceedings of the 27th International Conference on Very Large Data Bases
RoadRunner: Towards Automatic Data Extraction from Large Web Sites
Proceedings of the 27th International Conference on Very Large Data Bases
Jedi: Extracting and Synthesizing Information from the Web
COOPIS '98 Proceedings of the 3rd IFCIS International Conference on Cooperative Information Systems
Robust Pointing by XPath Language: Authoring Support and Empirical Evaluation
SAINT '03 Proceedings of the 2003 Symposium on Applications and the Internet
XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources
ICDE '00 Proceedings of the 16th International Conference on Data Engineering
A bag of paths model for measuring structural similarity in Web documents
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
EShopMonitor: A Web Content Monitoring Tool
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Adaptive information extraction: core technologies for information agents
Intelligent information agents
Towards more personalized web: extraction and integration of dynamic content from the web
APWeb'06 Proceedings of the 8th Asia-Pacific Web conference on Frontiers of WWW Research and Development
Hi-index | 0.00 |
Typical commercial Web sites publish information from multiple back-end data sources; these data sources are also updated very frequently. Given the size of most commercial sites today, it becomes essential to have an automated means of checking for correctness and consistency of data. The eShopmonitor allows users to specify items of interest to be tracked, monitors these items on the Web pages, and reports on any changes observed. Our solution comprises a crawler, a miner, a reporter, and a user component that work together to achieve the above functionality. The miner learns to locate the items of interest on a class of pages based on just one sample supplied by the user, via the user interface (UI) provided. The learning algorithm is based on the XPaths of the Document Object Model (DOM) of the page.