Shortest path edit distance for detecting duplicate biological entities

  • Authors: Alex Rudniy, Min Song, James Geller

  • Affiliations: New Jersey Institute of Technology, Newark, NJ (all authors)

  • Venue: Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
  • Year: 2010

Abstract

This paper presents Shortest Path Edit Distance (SPED), a novel, context-sensitive edit distance applied to duplicate entity detection in biological data. SPED extends the Markov Random Field-based Edit Distance: it recasts the edit distance computation as finding the shortest path between two selected vertices of a graph. The experimental results show that SPED produces competitive outcomes. SoftSPED, the combination of SPED with TFIDF, achieves superior performance in most cases.
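
To illustrate the general idea of treating edit distance as a shortest-path problem, here is a minimal sketch. It is not the SPED algorithm from the paper: it uses plain Levenshtein costs rather than context-sensitive, Markov Random Field-based weights. The function name `edit_distance_shortest_path` and the parameters `ins_cost`, `del_cost`, and `sub_cost` are assumptions made for illustration. Each vertex (i, j) means "the first i characters of `s` have been turned into the first j characters of `t`", and Dijkstra's algorithm finds the cheapest path from (0, 0) to (len(s), len(t)).

```python
import heapq


def edit_distance_shortest_path(s, t, ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Edit distance computed as a shortest path in the edit graph.

    Illustrative only: uniform costs stand in for the context-sensitive
    weights that a method such as SPED would use (assumption).
    """
    source, target = (0, 0), (len(s), len(t))
    dist = {source: 0.0}
    heap = [(0.0, source)]

    while heap:
        d, (i, j) = heapq.heappop(heap)
        if (i, j) == target:
            return d
        if d > dist.get((i, j), float("inf")):
            continue  # stale queue entry, a cheaper path was already found

        edges = []
        if i < len(s):
            edges.append(((i + 1, j), del_cost))      # delete s[i]
        if j < len(t):
            edges.append(((i, j + 1), ins_cost))      # insert t[j]
        if i < len(s) and j < len(t):
            # match costs nothing, substitution costs sub_cost
            step = 0.0 if s[i] == t[j] else sub_cost
            edges.append(((i + 1, j + 1), step))

        for vertex, weight in edges:
            candidate = d + weight
            if candidate < dist.get(vertex, float("inf")):
                dist[vertex] = candidate
                heapq.heappush(heap, (candidate, vertex))

    return dist[target]
```

With uniform costs this reproduces the standard Levenshtein distance. The graph formulation becomes useful when edge weights depend on context, because Dijkstra's algorithm works for any non-negative weights.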