A substantial subset of Web data has an underlying structure. For instance, the pages returned in response to a query submitted through a Web search form are usually generated by a program that reads structured data from a local database and embeds it into an HTML template. For software programs to benefit fully from these "semi-structured" Web sources, wrapper programs must be built to provide a "machine-readable" view over them. Because Web sources are autonomous, they may change in ways that invalidate the current wrapper, so automatic wrapper maintenance is an important issue. Wrappers must perform two tasks: navigating through Web sites and extracting structured data from HTML pages. While several works have addressed the automatic maintenance of data extraction tasks, to the best of our knowledge the problem of maintaining navigation sequences remains unaddressed. In this paper, we propose a set of novel techniques to fill this gap.