SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
FOCS '02 Proceedings of the 43rd Symposium on Foundations of Computer Science
Robust and efficient fuzzy match for online data cleaning
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Record linkage: similarity measures and algorithms
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Example-driven design of efficient record matching queries
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Efficient similarity joins for near duplicate detection
Proceedings of the 17th international conference on World Wide Web
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Learning string transformations from examples
Proceedings of the VLDB Endowment
On Graph-Based Name Disambiguation
Journal of Data and Information Quality (JDIQ)
Hi-index | 0.00 |
In this paper, we consider the entity resolution(ER) problem, which is to identify objects referring to the same real-world entity. Prior work of ER involves expensive similarity comparison and clustering approaches. Additionally, the quality of entity resolution may be low due to insufficient information. To address these problems, by adopting context information of data objects, we present a novel framework of entity resolution, context-based entity description (CED), to make context information help entity resolution. In our framework, each entity is described by a set of CEDs. During entity resolution, objects are only compared with CEDs to determine its corresponding entity. Additionally, we propose efficient algorithms for CED discovery and CED-based entity resolution. We experimentally evaluated our CED-based ER algorithm on the real DBLP datasets, and the experimental results show that our algorithm can achieve both high precision and recall as well as outperform existing methods.