Most data integration applications require a matching between the schemas of the data sets involved. We show how duplicates within these data sets can be exploited to identify matching attributes automatically. We describe an algorithm that first discovers duplicates among data sets with unaligned schemas and then uses these duplicates to match schemas whose column names are opaque. Discovering duplicates among data sets with unaligned schemas is more difficult than in the usual setting, because it is not clear which fields of one object should be compared with which fields of the other. We have developed a new algorithm that efficiently finds the most likely duplicates in such a setting. Given these duplicates, our schema matching algorithm identifies corresponding attributes by comparing the data values within the duplicate records. An experimental study on real-world data shows the effectiveness of this approach.
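To make the second step concrete, the following is a minimal sketch of matching attributes from duplicates that are already known. It is not the algorithm of the paper: the token-based Jaccard similarity, the greedy one-to-one assignment, the 0.5 threshold, and all names (`value_similarity`, `match_attributes`, the columns `A1` to `B3`, the sample records) are assumptions made for illustration only. The duplicate discovery step is assumed to have been carried out beforehand.

```python
import re
from itertools import product


def value_similarity(a, b):
    """Jaccard similarity over lowercase word tokens (illustrative stand-in)."""
    tokens_a = set(re.findall(r"\w+", str(a).lower()))
    tokens_b = set(re.findall(r"\w+", str(b).lower()))
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def match_attributes(duplicate_pairs, threshold=0.5):
    """Greedy one-to-one attribute matching from known duplicate pairs.

    duplicate_pairs: list of (record_a, record_b) tuples, where each record
    is a dict mapping an (opaque) column name to a value.
    Returns a dict from columns of the first schema to columns of the second.
    """
    cols_a = sorted({c for rec_a, _ in duplicate_pairs for c in rec_a})
    cols_b = sorted({c for _, rec_b in duplicate_pairs for c in rec_b})

    # Average value similarity of every column pair over all duplicates.
    scores = {}
    for col_a, col_b in product(cols_a, cols_b):
        sims = [
            value_similarity(rec_a.get(col_a, ""), rec_b.get(col_b, ""))
            for rec_a, rec_b in duplicate_pairs
        ]
        scores[(col_a, col_b)] = sum(sims) / len(sims)

    # Greedy assignment: best-scoring pairs first, each column used once.
    matches, used_a, used_b = {}, set(), set()
    for (col_a, col_b), score in sorted(scores.items(), key=lambda kv: -kv[1]):
        if score < threshold:
            break
        if col_a not in used_a and col_b not in used_b:
            matches[col_a] = col_b
            used_a.add(col_a)
            used_b.add(col_b)
    return matches


# Hypothetical duplicates from two sources with opaque column names.
duplicates = [
    ({"A1": "John Smith", "A2": "Berlin", "A3": "10115"},
     {"B1": "10115", "B2": "Smith, John", "B3": "Berlin"}),
    ({"A1": "Mary Jones", "A2": "Hamburg", "A3": "20095"},
     {"B1": "20095", "B2": "Jones, Mary", "B3": "Hamburg"}),
]
mapping = match_attributes(duplicates)
print(mapping)
```

Because the column names carry no information here, the correspondence is derived purely from the values that co-occur in duplicate records. A greedy assignment keeps the sketch short; an optimal bipartite matching over the same score matrix would be a natural alternative.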