Iterative record linkage for cleaning and integration
Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
Semantic integration in text: from ambiguous names to identifiable entities
AI Magazine - Special issue on semantic integration
Joint deduplication of multiple record types in relational data
Proceedings of the 14th ACM international conference on Information and knowledge management
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Collective entity resolution in relational data
ACM Transactions on Knowledge Discovery from Data (TKDD)
Entity resolution is a critical component of data integration where the goal is to reconcile database references corresponding to the same real-world entities. Given the abundance of publicly available databases that have unresolved entities, we motivate the problem of quick and accurate resolution for answering queries over such 'unclean' databases. Since collective entity resolution approaches, where related references are resolved jointly, have been shown to be more accurate than independent attribute-based resolution, we focus on adapting collective resolution for answering queries. We propose a two-stage collective resolution strategy for processing queries. We then show how it can be performed on-the-fly by adaptively extracting and resolving those database references that are the most helpful for resolving the query. We validate our approach on two large real-world publication databases, where we show the usefulness of collective resolution and at the same time demonstrate the need for adaptive strategies for query processing. We then show how the same queries can be answered in real time using our adaptive approach while preserving the gains of collective resolution. This work extends work presented in (Bhattacharya, Licamele, & Getoor 2006).