This paper describes an efficient Web page change detection system built on three optimizations of the Hungarian algorithm, which we use to compare the trees that correspond to HTML Web pages. The Hungarian algorithm runs in O(n^3) time, where n denotes the number of nodes in the tree. The optimizations attempt to stop the comparator before it completes all of its iterations, using criteria based on properties of HTML and on heuristics. Analysis and experimental results confirm the effectiveness of these optimizations and show that they yield O(n^2) performance. A complete system was implemented and used to carry out the performance experiments. It includes functionality and interfaces for processing user requests, fetching Web pages from the Internet, letting users select zones of a Web page to monitor, and highlighting changes on the monitored pages.
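To make the matching step concrete, the following Python sketch shows how the Hungarian algorithm can pair nodes from two versions of a page. It is an illustration under stated assumptions, not the system described above: the function names `hungarian` and `match_nodes`, the use of `difflib.SequenceMatcher` as the node similarity measure, and the flat lists of node texts standing in for HTML trees are all hypothetical. The identical-input shortcut only hints at the idea of stopping early; the paper's three optimizations are not specified in this abstract and are not reproduced here.

```python
from difflib import SequenceMatcher


def hungarian(cost):
    """Solve a square assignment problem in O(n^3).

    Returns (total_cost, assignment) where assignment[i] is the column
    chosen for row i.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)    # row potentials
    v = [0.0] * (n + 1)    # column potentials
    p = [0] * (n + 1)      # p[j] = row matched to column j (1-based, 0 = none)
    way = [0] * (n + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path that was found.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment


def match_nodes(old, new):
    """Pair node texts of two page versions as (old_idx, new_idx, similarity)."""
    # Hypothetical shortcut: identical inputs need no matching at all.
    if old == new:
        return [(i, i, 1.0) for i in range(len(old))]
    n = max(len(old), len(new))
    # Pad to a square matrix; padded cells cost 1.0 (no similarity).
    cost = [[1.0] * n for _ in range(n)]
    for i, a in enumerate(old):
        for j, b in enumerate(new):
            cost[i][j] = 1.0 - SequenceMatcher(None, a, b).ratio()
    _, assignment = hungarian(cost)
    return [
        (i, j, 1.0 - cost[i][j])
        for i, j in enumerate(assignment)
        if i < len(old) and j < len(new)
    ]


old_nodes = ["Home", "Price: $10", "Contact us"]
new_nodes = ["Contact us", "Home", "Price: $12"]
pairs = match_nodes(old_nodes, new_nodes)
changed = [(i, j) for i, j, sim in pairs if sim < 1.0]
```

In this example the reordered nodes are matched to their counterparts, and `changed` keeps only the pairs whose text differs, which is the kind of result a change highlighter would consume. The cubic cost comes from the nested loops in `hungarian`, which is why stopping the comparison early matters for large pages.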