Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices

  • Authors:
  • Guang Li;Yadong Wang;Xiaohong Su

  • Affiliations:
  • School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, People's Republic of China;School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, People's Republic of China;School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, People's Republic of China

  • Venue:
  • Computer Methods and Programs in Biomedicine
  • Year:
  • 2012

Abstract

When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it relies on time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, so it cannot return results quickly when the database is updated. This study improves the DNALA method. Specifically, we replace the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we design a hybrid clustering algorithm composed of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated.
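
As a rough illustration of the two ideas named in the abstract (global pairwise alignment, then pairing sequences by a matching that maximizes total weight), the following minimal Python sketch may help. It is not the authors' implementation. The scoring parameters, the use of the raw alignment score as the pair weight, and the function names `global_alignment_score` and `best_pairing` are all assumptions made for this example; the paper's actual cost is defined through generalization lattices, which this sketch does not model. The matching step is done by exhaustive search, which is only practical for a handful of sequences and merely stands in for a real MWM algorithm.

```python
from functools import lru_cache
from itertools import combinations


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score (score only, no traceback).

    The match/mismatch/gap values are illustrative assumptions.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def best_pairing(seqs):
    """Return (total_score, pairs) for the perfect pairing with maximum total score.

    Exhaustive search over bitmasks: a stand-in for a proper maximum weight
    matching algorithm, usable only for small, even-sized inputs.
    """
    n = len(seqs)
    if n % 2:
        raise ValueError("this sketch expects an even number of sequences")
    score = {
        (i, j): global_alignment_score(seqs[i], seqs[j])
        for i, j in combinations(range(n), 2)
    }

    @lru_cache(maxsize=None)
    def solve(remaining):
        if remaining == 0:
            return 0, ()
        # Always pair the lowest-indexed unpaired sequence to avoid duplicates.
        i = (remaining & -remaining).bit_length() - 1
        best = None
        for j in range(i + 1, n):
            if remaining >> j & 1:
                rest = remaining & ~(1 << i) & ~(1 << j)
                sub_total, sub_pairs = solve(rest)
                total = sub_total + score[(i, j)]
                if best is None or total > best[0]:
                    best = (total, ((i, j),) + sub_pairs)
        return best

    return solve((1 << n) - 1)


if __name__ == "__main__":
    sequences = ["ACGT", "ACGA", "TTTT", "TTTA"]  # toy data, not from the paper
    total, pairs = best_pairing(sequences)
    print(total, pairs)
```

On the toy input, the two similar sequences in each half end up paired together. A greedy strategy that fixes the single best pair first can, on other inputs, force poor pairings among the leftovers, which is the kind of accuracy loss the abstract attributes to the greedy clustering in DNALA.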