Duplicate detection is an important process for cleaning and integrating data. Since real-life data is often polluted, duplicate detection usually involves uncertainty. To handle this uncertainty appropriately, indeterministic duplicate detection approaches have been developed, i.e., approaches in which ambiguous duplicate decisions are modeled probabilistically in the resulting data. To rate the goodness of a duplicate detection approach, the quality of its detection results needs to be evaluated. In this paper, we propose several semantics for applying traditional quality evaluation measures to indeterministic duplicate detection results and, as an example, present an efficient evaluation for one of these semantics. Finally, we present some experimental results.
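
The abstract does not spell out the proposed semantics, so the sketch below is only an illustration of one natural candidate, not the paper's actual method: an expectation semantics that computes classical pairwise precision and recall in every possible world of an indeterministic detection result and averages them, weighted by the world probabilities. All function names (duplicate_pairs, precision_recall, expected_precision_recall) and the toy data are hypothetical.

    from itertools import combinations

    def duplicate_pairs(clustering):
        # All unordered record pairs that a clustering declares to be duplicates.
        pairs = set()
        for cluster in clustering:
            pairs.update(frozenset(p) for p in combinations(sorted(cluster), 2))
        return pairs

    def precision_recall(found, gold):
        # Classical pairwise precision/recall of one deterministic result
        # against a gold-standard pair set.
        tp = len(found & gold)
        precision = tp / len(found) if found else 1.0
        recall = tp / len(gold) if gold else 1.0
        return precision, recall

    def expected_precision_recall(worlds, gold_clustering):
        # Expectation semantics (assumed here for illustration): average the
        # deterministic measures over all possible worlds, weighted by each
        # world's probability.
        gold = duplicate_pairs(gold_clustering)
        exp_p = exp_r = 0.0
        for probability, clustering in worlds:
            p, r = precision_recall(duplicate_pairs(clustering), gold)
            exp_p += probability * p
            exp_r += probability * r
        return exp_p, exp_r

    # Hypothetical indeterministic result: two possible worlds over records a..d.
    worlds = [
        (0.7, [{"a", "b"}, {"c"}, {"d"}]),   # world 1: only a/b are duplicates
        (0.3, [{"a", "b"}, {"c", "d"}]),     # world 2: a/b and c/d are duplicates
    ]
    gold = [{"a", "b"}, {"c", "d"}]          # ground-truth clustering

    print(expected_precision_recall(worlds, gold))

For this toy data the expected precision is 1.0 and the expected recall is 0.7 * 0.5 + 0.3 * 1.0 = 0.65: the more likely world, in which c and d remain unmerged, pulls the recall down. Other semantics (e.g., evaluating only the most probable world) would give different figures.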