Schema matching is a critical problem in integrating heterogeneous information sources. Traditionally, matching multiple schemas has relied on finding pairwise attribute correspondences. This paper proposes a different approach, motivated by the need to integrate large numbers of data sources on the Internet. On this "deep Web," we observe two distinguishing characteristics that offer a new view of schema matching. First, as the Web scales, many sources provide structured information in the same domains (e.g., books and automobiles). Second, while sources proliferate, their aggregate schema vocabulary tends to converge at a relatively small size. Motivated by these observations, we propose a new paradigm, statistical schema matching: unlike traditional approaches based on pairwise attribute correspondence, we take a holistic approach that matches all input schemas by finding an underlying generative schema model. We propose a general statistical framework, MGS, for such hidden model discovery, consisting of hypothesis modeling, generation, and selection. We then specialize the general framework to develop Algorithm MGSsd, which targets synonym discovery, a canonical problem of schema matching, by designing and discovering a model that specifically captures synonym attributes. We demonstrate our approach on hundreds of real Web sources in four domains, and the results show good accuracy.
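The second observation, that the aggregate schema vocabulary converges as sources are added, can be checked on any collection of schemas by tracking how many distinct attribute names have been seen so far. The following is a minimal sketch of that measurement only, not the MGS framework itself. The function name `vocabulary_growth`, the lowercase normalization, and the toy "books" schemas are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Iterable, List, Set


def vocabulary_growth(schemas: Iterable[Iterable[str]]) -> List[int]:
    """Return the cumulative count of distinct attribute names after each schema.

    Attribute names are compared after trimming whitespace and lowercasing,
    which is an illustrative normalization choice, not one from the paper.
    """
    seen: Set[str] = set()
    sizes: List[int] = []
    for schema in schemas:
        seen.update(name.strip().lower() for name in schema)
        sizes.append(len(seen))
    return sizes


# Hypothetical toy schemas for a "books" domain (not data from the paper).
toy_schemas = [
    ["title", "author", "isbn"],
    ["title", "author", "subject"],
    ["title", "author", "isbn", "publisher"],
    ["Title", "Author", "Publisher"],
    ["title", "isbn", "subject"],
]

growth = vocabulary_growth(toy_schemas)
print(growth)  # flattens once later schemas stop contributing new names
```

A curve that flattens as more sources are added is the kind of evidence the observation describes. On real data, the normalization step would likely need more care than simple lowercasing.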