Reconciling schemas of disparate data sources: a machine-learning approach
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Learning to map between ontologies on the semantic web
Proceedings of the 11th international conference on World Wide Web
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
A survey of approaches to automatic schema matching
The VLDB Journal — The International Journal on Very Large Data Bases
A machine learning approach to coreference resolution of noun phrases
Computational Linguistics - Special issue on computational anaphora resolution
Web-scale information extraction in knowitall: (preliminary results)
Proceedings of the 13th international conference on World Wide Web
Domain Independent Learning of Ontology Mappings
AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 2
Journal of the American Society for Information Science and Technology
Comparative study of name disambiguation problem using a scalable blocking-based framework
Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Entity Resolution with Markov Logic
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Canonicalization of database records using adaptive similarity measures
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
The SemEval-2007 WePS evaluation: establishing a benchmark for the web people search task
SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
Improving author coreference by resource-bounded information gathering from the web
IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
BLOG: probabilistic models with unknown objects
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Efficient top-k count queries over imprecise duplicates
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Using Wikipedia to bootstrap open information extraction
ACM SIGMOD Record
Generic Entity Resolution in Relational Databases
ADBIS '09 Proceedings of the 13th East European Conference on Advances in Databases and Information Systems
Instance-based domain ontological view creation towards semantic integration
Expert Systems with Applications: An International Journal
Synthesizing products for online catalogs
Proceedings of the VLDB Endowment
Collective graph identification
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Coreference aware web object retrieval
Proceedings of the 20th ACM international conference on Information and knowledge management
Active XML (AXML) intensional data exchange
Proceedings of the 13th International Conference on Information Integration and Web-based Applications and Services
Attribute and object selection queries on objects with probabilistic attributes
ACM Transactions on Database Systems (TODS)
IDA'10 Proceedings of the 9th international conference on Advances in Intelligent Data Analysis
Flexible and efficient distributed resolution of large entities
FoIKS'12 Proceedings of the 7th international conference on Foundations of Information and Knowledge Systems
Confidence-Based incremental classification for objects with limited attributes in vertical search
IEA/AIE'12 Proceedings of the 25th international conference on Industrial Engineering and Other Applications of Applied Intelligent Systems: advanced research in applied artificial intelligence
Structuring e-commerce inventory
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Joint entity and event coreference resolution across documents
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Name disambiguation in scientific cooperation network by exploiting user feedback
Artificial Intelligence Review
Hi-index | 0.00 |
The automatic consolidation of database records from many heterogeneous sources into a single repository requires solving several information integration tasks. Although tasks such as coreference, schema matching, and canonicalization are closely related, they are most commonly studied in isolation. Systems that do tackle multiple integration problems traditionally solve each independently, allowing errors to propagate from one task to another. In this paper, we describe a discriminatively-trained model that reasons about schema matching, coreference, and canonicalization jointly. We evaluate our model on a real-world data set of people and demonstrate that simultaneously solving these tasks reduces errors over a cascaded or isolated approach. Our experiments show that a joint model is able to improve substantially over systems that either solve each task in isolation or with the conventional cascade. We demonstrate nearly a 50% error reduction for coreference and a 40% error reduction for schema matching.