Data Quality and Record Linkage Techniques
Swoosh: a generic approach to entity resolution
The VLDB Journal — The International Journal on Very Large Data Bases
Entity Resolution and Information Quality
Journal of Computing Sciences in Colleges
A Graduate-Level Course on Entity Resolution and Information Quality: A Step toward ER Education
Journal of Data and Information Quality (JDIQ) - Special Issue on Entity Resolution
This paper describes the experience of constructing and deploying a substantial entity resolution (ER) exercise designed to simulate more closely the challenges often encountered in real-world data integration projects. The exercise is based on a consistent set of synthetically generated demographic data that have been separated and disrupted in a controlled manner. The datasets are large enough (several thousand records) to give students a significant challenge, yet small enough to be managed within a semester course using tools that run on a desktop platform. Because the starting state of the integrated data is known, student progress in re-integrating the data can be readily and objectively measured. This gives students feedback on their progress and allows them to assess the effectiveness of the different strategies and approaches they try. The details given here are based on the experience of conducting the ER challenge on three occasions in two different courses.
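The abstract does not say which measure was used to score student progress. As an illustration only, the sketch below assumes one common choice for ER results, pairwise precision, recall, and F1, computed by comparing a student's clustering of records against the known true grouping. The function names, record identifiers, and cluster labels are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def linked_pairs(assignment):
    """Return the set of unordered record-id pairs that share a cluster label."""
    clusters = {}
    for record_id, label in assignment.items():
        clusters.setdefault(label, []).append(record_id)
    pairs = set()
    for members in clusters.values():
        # Sorting makes each pair a canonical (smaller_id, larger_id) tuple.
        pairs.update(combinations(sorted(members), 2))
    return pairs


def pairwise_scores(truth, predicted):
    """Compare predicted links against true links; return (precision, recall, f1)."""
    true_pairs = linked_pairs(truth)
    pred_pairs = linked_pairs(predicted)
    correct = len(true_pairs & pred_pairs)
    # Convention assumed here: an empty pair set scores 1.0 (nothing to get wrong).
    precision = correct / len(pred_pairs) if pred_pairs else 1.0
    recall = correct / len(true_pairs) if true_pairs else 1.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical ground truth: records r1-r3 are one person, r4-r5 another.
truth = {"r1": "A", "r2": "A", "r3": "A", "r4": "B", "r5": "B"}
# Hypothetical student result: one correct link, one false link, three missed links.
student = {"r1": "x", "r2": "x", "r3": "y", "r4": "y", "r5": "z"}

precision, recall, f1 = pairwise_scores(truth, student)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Scoring on record pairs rather than on whole clusters gives partial credit for partly correct groupings, so a student can see whether a change in matching strategy produced fewer false links, fewer missed links, or both.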