Many interrelated information sources on the Internet can be categorized as either structured (databases) or semistructured (documents). A key challenge is to integrate, query, and analyze such heterogeneous collections of information. In this paper, we defend the idea of building web metadata repositories that use relational databases as the main source of structured data and as the central data management technology, enriched by the semistructured data surrounding them. Our proposal rests on the assumption that heterogeneous relational databases can be integrated (i.e., entity resolution is assumed to work well) and can thus serve as references for external data. That is, we tackle the problem of integrating information in the deep web, taking databases as the starting point. We discuss a prototype system, based on relational database technology, that can integrate and query metadata and related documents. Metadata include database ER model elements such as database name, table, and column (entity, attribute). Web document data include files, documents, and web pages. Links between metadata and external documents are built with SQL queries. Once databases and documents are linked, they are managed and queried with SQL. We discuss a scientific application of our solution to a water pollution database.
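To make the linking step concrete, the following is a minimal sketch of how links between metadata and documents might be built and queried with SQL. It is not the prototype described above: the table names (`metadata`, `document`, `link`), the sample rows, and the case-insensitive substring match are all assumptions made for illustration, and SQLite (driven from Python) merely stands in for whichever relational DBMS is actually used.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one row per metadata element, one row per document,
    # and a link table that relates the two.
    conn.executescript("""
        CREATE TABLE metadata (
            meta_id     INTEGER PRIMARY KEY,
            db_name     TEXT,
            table_name  TEXT,
            column_name TEXT
        );
        CREATE TABLE document (
            doc_id INTEGER PRIMARY KEY,
            uri    TEXT,
            body   TEXT
        );
        CREATE TABLE link (
            meta_id INTEGER REFERENCES metadata(meta_id),
            doc_id  INTEGER REFERENCES document(doc_id)
        );
    """)


def load_sample(conn):
    # Invented sample rows, loosely themed on a water pollution database.
    conn.executemany(
        "INSERT INTO metadata (db_name, table_name, column_name) VALUES (?, ?, ?)",
        [("waterdb", "well", "arsenic"), ("waterdb", "well", "nitrate")],
    )
    conn.executemany(
        "INSERT INTO document (uri, body) VALUES (?, ?)",
        [
            ("report.html", "Arsenic concentrations measured in wells"),
            ("notes.txt", "Nitrate and arsenic sampling protocol"),
            ("index.html", "Contact information"),
        ],
    )


def build_links(conn):
    # Assumed linking rule: a document is linked to a column when the
    # column name occurs in the document body, ignoring case.
    conn.execute("""
        INSERT INTO link (meta_id, doc_id)
        SELECT m.meta_id, d.doc_id
        FROM metadata AS m
        JOIN document AS d
          ON instr(lower(d.body), lower(m.column_name)) > 0
    """)
    return conn.execute("SELECT COUNT(*) FROM link").fetchone()[0]


def documents_for_column(conn, column_name):
    # Once linked, metadata and documents are queried together with plain SQL.
    rows = conn.execute("""
        SELECT d.uri
        FROM link AS l
        JOIN metadata AS m ON m.meta_id = l.meta_id
        JOIN document AS d ON d.doc_id = l.doc_id
        WHERE m.column_name = ?
        ORDER BY d.uri
    """, (column_name,)).fetchall()
    return [uri for (uri,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    load_sample(conn)
    print(build_links(conn))                        # 3
    print(documents_for_column(conn, "arsenic"))    # ['notes.txt', 'report.html']
```

Keeping the links in an ordinary table means the same join machinery serves both link construction and later querying; a production system would likely replace the naive substring match with full-text indexing or a more careful matching rule.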