On-line web database integration

Authors:
Hao Tan; Parisa Ghodous; Jacky Montiel
Affiliations:
LIRIS, University Lyon, Lyon, France; University Lyon, Lyon, France; ALTERNANCE Soft Lyon, France
Venue:
Proceedings of the International Conference on Management of Emergent Digital EcoSystems
Year:
2010

Abstract

The Deep Web (often called the hidden web or invisible web) is composed of all web databases. As the Deep Web has grown, more and more researchers have turned their attention to the integration of web databases. Achieving this goal, however, requires a complex system in which many applications work together. We are interested in an automatic extraction system that retrieves the forms or result lists from websites in one specific domain, government procurement. To tackle this challenge, we propose a solution that creates a unified interface for querying resources in a predefined domain. In this paper, we discuss the automatic extraction system in several steps. First, a crawler for web query interfaces that can execute JavaScript ensures coverage of the web databases. Second, the query interface extractor and the interface integrator allow us to query all the discovered web databases through a global query interface. Third, the result page extractor and the result integrator provide a unified presentation of the results. Lastly, a feedback method is developed to collect information on result accuracy, and a statistical model is built from it to improve the performance of steps 2 and 3. We design the system to be dynamic: the more it is used, the better its results become.
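The abstract does not specify the statistical model behind the feedback step, so the following is only a minimal sketch of one way such a loop could work, not the authors' method. It assumes feedback arrives as accept/reject judgments on proposed matches between a local form label and a global interface attribute, and it scores each match with a Laplace-smoothed acceptance rate. The class name `FeedbackMatcher`, its methods, and the sample labels and attributes are all hypothetical.

```python
from collections import defaultdict


class FeedbackMatcher:
    """Hypothetical label-to-attribute matcher that learns from user feedback."""

    def __init__(self):
        # counts[(label, attribute)] = [accepted, rejected]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, label, attribute, correct):
        """Store one feedback event for a proposed (label, attribute) match."""
        self.counts[(label, attribute)][0 if correct else 1] += 1

    def score(self, label, attribute):
        """Laplace-smoothed acceptance rate; 0.5 when no feedback exists."""
        accepted, rejected = self.counts.get((label, attribute), (0, 0))
        return (accepted + 1) / (accepted + rejected + 2)

    def best_attribute(self, label, candidates):
        """Pick the global attribute with the highest score for a local label."""
        return max(candidates, key=lambda attribute: self.score(label, attribute))


# Illustrative usage with made-up procurement labels (assumed, not from the paper).
matcher = FeedbackMatcher()
matcher.record("Buyer", "contracting_authority", True)
matcher.record("Buyer", "contracting_authority", True)
matcher.record("Buyer", "supplier", False)
print(matcher.best_attribute("Buyer", ["supplier", "contracting_authority"]))
```

Under these assumptions, each additional feedback event shifts the score of a match toward its observed acceptance rate, which is one simple reading of "the more it is used, the better its results become."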