Recent work on Internet information integration assumes a library of wrappers, specialized information extraction procedures. Maintaining wrappers is difficult, because the formatting regularities on which they rely often change. The wrapper verification problem is to determine whether a wrapper is correct. Standard regression testing approaches are inappropriate, because both the formatting regularities and a site's underlying content may change. We introduce RAPTURE, a fully-implemented, domain-independent verification algorithm. RAPTURE uses well-motivated heuristics to compute the similarity between a wrapper's expected and observed output. Experiments with 27 actual Internet sites show a substantial performance improvement over standard regression testing.
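
The abstract does not spell out RAPTURE's heuristics, so the following is only a minimal sketch of the general idea of comparing expected and observed wrapper output by their surface statistics rather than by exact content. The feature set (`word_count`, `mean_word_length`, `digit_fraction`), the function names, and the `tolerance` and `min_std` parameters are all assumptions made for illustration, not the paper's method.

```python
from statistics import mean, pstdev


def features(value: str) -> dict[str, float]:
    """Surface features of one extracted string (illustrative choice, not RAPTURE's)."""
    words = value.split()
    return {
        "word_count": float(len(words)),
        "mean_word_length": mean([len(w) for w in words]) if words else 0.0,
        "digit_fraction": sum(c.isdigit() for c in value) / len(value) if value else 0.0,
    }


def profile(values: list[str]) -> dict[str, tuple[float, float]]:
    """Per-feature (mean, population std) over a list of extracted strings."""
    if not values:
        raise ValueError("need at least one extracted value")
    rows = [features(v) for v in values]
    return {
        name: (mean([r[name] for r in rows]), pstdev([r[name] for r in rows]))
        for name in rows[0]
    }


def looks_correct(
    expected: list[str],
    observed: list[str],
    tolerance: float = 3.0,  # hypothetical threshold, in standard deviations
    min_std: float = 0.05,   # crude floor so zero-variance features stay usable
) -> bool:
    """Accept the observed output if every feature mean stays near the expected one."""
    if not observed:
        return False  # a wrapper that extracts nothing is treated as broken
    expected_profile = profile(expected)
    observed_profile = profile(observed)
    for name, (exp_mean, exp_std) in expected_profile.items():
        obs_mean, _ = observed_profile[name]
        if abs(obs_mean - exp_mean) > tolerance * max(exp_std, min_std):
            return False
    return True


# Output known to be correct when the wrapper was built.
known_good = ["$12.99", "$8.50", "$103.00"]

# The site's content changed but the format did not: still plausible prices.
print(looks_correct(known_good, ["$7.25", "$19.99"]))

# The site's layout changed and the wrapper now extracts the wrong text.
print(looks_correct(known_good, ["Add to cart now", "Click here for more details"]))
```

Exact-match regression testing would reject both observed outputs, since neither reproduces the stored values. Comparing feature statistics lets content changes through while still flagging output whose shape no longer resembles what the wrapper used to extract.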