One of the most challenging problems in Enterprise Information Integration is dealing with heterogeneous information sources on the Web. These sources usually provide information in human-readable form only, which makes it difficult for a software agent to understand. Current solutions build on the idea of annotating the information with semantics. If the information is unstructured, proposals such as S-CREAM, MnM, or Armadillo may be effective enough, since they rely on natural language processing techniques; furthermore, their accuracy can be improved by exploiting redundant information on the Web, as C-PANKOW has recently shown. If the information is structured and closely related to a back-end database, Deep Annotation ranks among the most effective proposals, but it requires information providers to modify their applications; if Deep Annotation is not applicable, the easiest solution is to use a wrapper and transform its output into annotations. In this paper, we show that this transformation can be automated by means of an efficient, domain-independent algorithm. To the best of our knowledge, this is the first attempt to devise and formalize such a systematic, general solution.