The merge/purge problem for large databases
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Adaptive duplicate detection using learnable string similarity measures
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
The paper considers performance issues in a software application that identifies duplicate records in a customer information database. It discusses the selected approaches, logic, and algorithms, reviews some of the essential papers in the area, and examines the open problems and the expected performance gain.