Object matching is a fundamental problem that arises in numerous information integration scenarios. Virtually all existing solutions assume that the objects to be matched share the same attribute set and that systems can match them by comparing attribute similarities. Our work addresses the more general problem in which objects also have disjoint attributes, for example, matching tuples from relational tables that have different schemas, such as (age, name) and (name, salary). Profile-Based Object Matching (PROM) exploits these disjoint attributes to improve matching accuracy. PROM first matches any two tuples based on a shared attribute, such as name. It then applies a set of profilers, each of which contains some knowledge about what constitutes a typical person. The profilers examine the tuple pair to see whether it plausibly describes a person. A profiler might state, for example, that if the pair produces a person with an age of 6 and a salary of $100,000, the pair doesn't describe a real person, so the tuples don't match. Profilers can be manually specified by domain experts, trained on training data, transferred from other matching tasks, or built from external data. PROM is thus distinct in that it not only exploits disjoint attributes to improve matching accuracy but also facilitates knowledge reuse from previous object-matching tasks.
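The two-stage idea described above can be illustrated with a minimal Python sketch. It is not PROM's actual implementation: the function names, the string-similarity measure (`difflib.SequenceMatcher`), the 0.85 threshold, and the age and salary cutoffs in the hand-written profiler are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List

Record = Dict[str, object]
# A profiler returns True when the merged record plausibly describes a person.
Profiler = Callable[[Record], bool]


def name_similarity(a: str, b: str) -> float:
    """Similarity of the shared attribute (stand-in measure, an assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def merge(left: Record, right: Record) -> Record:
    """Combine two tuples with different schemas into one candidate person."""
    merged = dict(left)
    merged.update(right)
    return merged


def child_salary_profiler(person: Record) -> bool:
    """Hypothetical hand-written profiler: young children rarely earn high salaries."""
    age = person.get("age")
    salary = person.get("salary")
    if age is None or salary is None:
        return True  # nothing to check, so do not veto
    return not (age < 16 and salary > 50_000)  # cutoffs are illustrative only


def prom_style_match(left: Record, right: Record,
                     profilers: List[Profiler],
                     threshold: float = 0.85) -> bool:
    # Stage 1: compare the shared attribute (here, name).
    if name_similarity(str(left["name"]), str(right["name"])) < threshold:
        return False
    # Stage 2: every profiler must accept the merged tuple pair.
    candidate = merge(left, right)
    return all(profiler(candidate) for profiler in profilers)


profilers = [child_salary_profiler]
child = {"age": 6, "name": "Mike Smith"}      # schema (age, name)
adult = {"age": 40, "name": "Mike Smith"}     # schema (age, name)
earner = {"name": "Mike Smith", "salary": 100_000}  # schema (name, salary)

print(prom_style_match(child, earner, profilers))  # False: vetoed by the profiler
print(prom_style_match(adult, earner, profilers))  # True: names match, pair is plausible
```

Modeling each profiler as an independent predicate keeps it separate from the matching logic, which is one simple way to picture how profilers could be reused across matching tasks; profilers that are trained or built from external data would more likely return a confidence score than a hard yes or no.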