OXPath: little language, little memory, great value

Authors:
Andrew Jon Sellers; Tim Furche; Georg Gottlob; Giovanni Grasso; Christian Schallhart
Affiliations:
University of Oxford, Oxford, United Kingdom; University of Oxford, Oxford, United Kingdom; University of Oxford, Oxford, United Kingdom; University of Oxford, Oxford, United Kingdom; University of Oxford, Oxford, United Kingdom
Venue:
Proceedings of the 20th international conference companion on World wide web
Year:
2011

Abstract

Data about everything is readily available on the web, but it is often accessible only through elaborate user interactions. For automated decision support, extracting that data is essential, but infeasible with existing heavy-weight data extraction systems. In this demonstration, we present OXPath, a novel approach to web extraction, with a system that supports informed job selection and integrates information from several different web sites. By carefully extending XPath, OXPath exploits its familiarity and provides a light-weight interface that is easy to use and embed. We highlight how OXPath guarantees optimal page buffering, storing only a constant number of pages for non-recursive queries.