The String-to-String Correction Problem
Journal of the ACM (JACM)
An Extension of the String-to-String Correction Problem
Journal of the ACM (JACM)
Duplicate Record Detection: A Survey
IEEE Transactions on Knowledge and Data Engineering
Bioinformatics
Detecting Duplicate Biological Entities Using Markov Random Field-Based Edit Distance
BIBM '08 Proceedings of the 2008 IEEE International Conference on Bioinformatics and Biomedicine
This paper presents a novel, context-sensitive Shortest Path Edit Distance (SPED) applied to duplicate entity detection in biological data. SPED extends the Markov Random Field-based Edit Distance by reducing the edit distance computation to finding the shortest path between two selected vertices of a graph. The experimental results show that SPED produces competitive outcomes. SoftSPED, the combination of SPED with TF-IDF, achieves superior performance in most cases.
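The reduction of edit distance to a shortest-path computation can be illustrated with a minimal sketch. This is not the SPED algorithm itself: it is the classic Levenshtein edit graph, in which vertex (i, j) means "the first i characters of s have been turned into the first j characters of t" and edges correspond to single edit operations. The function name, the fixed costs `sub_cost` and `indel_cost`, and the use of Dijkstra's algorithm are all assumptions made for illustration; a context-sensitive variant would presumably replace the fixed costs with weights that depend on the surrounding characters.

```python
import heapq


def edit_distance_shortest_path(s, t, sub_cost=1, indel_cost=1):
    """Edit distance as the shortest path from (0, 0) to (len(s), len(t)).

    Hypothetical illustration with fixed, context-free costs.
    """
    source, target = (0, 0), (len(s), len(t))
    dist = {source: 0}
    heap = [(0, source)]

    while heap:
        d, (i, j) = heapq.heappop(heap)
        if (i, j) == target:
            return d
        if d > dist.get((i, j), float("inf")):
            continue  # stale heap entry, a shorter path was already found

        edges = []
        if i < len(s):
            edges.append(((i + 1, j), indel_cost))  # delete s[i]
        if j < len(t):
            edges.append(((i, j + 1), indel_cost))  # insert t[j]
        if i < len(s) and j < len(t):
            # match costs nothing, substitution costs sub_cost
            cost = 0 if s[i] == t[j] else sub_cost
            edges.append(((i + 1, j + 1), cost))

        for node, weight in edges:
            new_dist = d + weight
            if new_dist < dist.get(node, float("inf")):
                dist[node] = new_dist
                heapq.heappush(heap, (new_dist, node))

    return dist[target]
```

With unit costs, the shortest path length equals the ordinary Levenshtein distance, so `edit_distance_shortest_path("kitten", "sitting")` gives 3. Formulating the problem on a graph makes it straightforward to swap in edge weights that vary by position or context.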