Web data extraction systems in use today transform semi-structured Web documents into structured documents and deliver them to end users. Some systems provide a visual interface for generating the extraction rules. For end users, however, the visual appearance of the Web documents is lost during the transformation. In this paper, we propose an approach that allows a user to query extracted documents without knowledge of a formal query language. We bridge the gap between the visual appearance of Web documents and the structured documents extracted from them by providing a QBE-like (Query by Example) interface named Wdee. The principal component of our method is the notion of a document schema. Document schemata are patterns of structures embedded in documents. Wdee generates tree skeletons based on schema information, and a user executes queries by entering conditions in the skeletons. By maintaining the mapping between the schemata of Web documents and those of extracted documents, Wdee can present a visual example to end users. With this example, a user can construct tree skeletons in a manner that resembles browsing Web pages.
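The idea of querying by filling conditions into a tree skeleton can be illustrated with a minimal Python sketch. It is not Wdee's implementation: the abstract does not describe its data structures, so the names `Node`, `Skeleton`, `derive_skeleton`, `matches`, and `find`, as well as the sample catalog, are hypothetical. The sketch also assumes a simplified schema in which a skeleton keeps one child per distinct label.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """A node of an extracted (structured) document."""
    label: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


@dataclass
class Skeleton:
    """A query template; `condition` is the blank the user may fill in."""
    label: str
    condition: Optional[Callable[[str], bool]] = None
    children: List["Skeleton"] = field(default_factory=list)


def derive_skeleton(node: Node) -> Skeleton:
    """Assumed simplification: keep one skeleton child per distinct label."""
    skel = Skeleton(node.label)
    seen = set()
    for child in node.children:
        if child.label not in seen:
            seen.add(child.label)
            skel.children.append(derive_skeleton(child))
    return skel


def matches(node: Node, skel: Skeleton) -> bool:
    """A node matches if its label, its condition, and every skeleton child match."""
    if node.label != skel.label:
        return False
    if skel.condition is not None and not skel.condition(node.text):
        return False
    return all(
        any(matches(child, sub) for child in node.children)
        for sub in skel.children
    )


def find(node: Node, skel: Skeleton) -> List[Node]:
    """Collect every node in the document that matches the skeleton."""
    found = [node] if matches(node, skel) else []
    for child in node.children:
        found.extend(find(child, skel))
    return found


def book(title: str, price: str) -> Node:
    return Node("book", children=[Node("title", title), Node("price", price)])


# Hypothetical extracted document.
catalog = Node("catalog", children=[
    book("Data on the Web", "45.00"),
    book("Web Data Mining", "25.50"),
    book("XML Basics", "19.99"),
])

# The system derives the skeleton; the user only fills in a condition.
book_skeleton = derive_skeleton(catalog).children[0]
price_slot = next(s for s in book_skeleton.children if s.label == "price")
price_slot.condition = lambda text: float(text) < 30

cheap_titles = [
    next(c.text for c in b.children if c.label == "title")
    for b in find(catalog, book_skeleton)
]
print(cheap_titles)  # ['Web Data Mining', 'XML Basics']
```

In this sketch the user never writes a query expression: the skeleton comes from the document structure, and the only input is the condition placed in the `price` slot.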