Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
The AWK programming language
Programming perl
A scalable comparison-shopping agent for the World-Wide Web
AGENTS '97 Proceedings of the first international conference on Autonomous agents
Wrapper generation for semi-structured Internet sources
ACM SIGMOD Record
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Ontology-based extraction and structuring of information from data-rich unstructured documents
Proceedings of the seventh international conference on Information and knowledge management
The String-to-String Correction Problem
Journal of the ACM (JACM)
Conceptual-model-based data extraction from multiple-record Web pages
Data & Knowledge Engineering
A Conceptual-Modeling Approach to Extracting Data from the Web
ER '98 Proceedings of the 17th International Conference on Conceptual Modeling
Constraint-based wrapper specification and verification for cooperative information systems
Information Systems - Special issue: Data quality in cooperative information systems
Hi-index | 0.00 |
A considerable amount of clean semistructured data is internally available to companies in the form of business reports. However, business reports are untapped for data mining, data warehousing, and querying because they are not in relational form. Business reports have a regular structure that can be reconstructed. We present algorithms that automatically infer the regular structure underlying business reports and automatically generate wrappers to extract relational data.