Approximate clustering of fingerprint vectors with missing values

  • Authors:
  • Andres Figueroa; Avraham Goldstein; Tao Jiang; Maciej Kurowski; Andrzej Lingas; Mia Persson

  • Affiliations:
  • University of California Riverside, Riverside, CA; Yeshiva University, New York, NY; University of California Riverside, Riverside, CA; Warsaw University, Banacha, Warsaw, Poland; Lund University, Lund, Sweden; Malmö University College, Malmö, Sweden

  • Venue:
  • CATS '05 Proceedings of the 2005 Australasian symposium on Theory of computing - Volume 41
  • Year:
  • 2005

Abstract

We study the problem of clustering fingerprints with at most p missing values (CMV(p) for short), which arises naturally in oligonucleotide fingerprinting, an efficient method for characterizing DNA clone libraries. We show that CMV(2) is already NP-hard. We also show that a greedy algorithm yields a $\min(1 + \ln n,\ 2 + p \ln l)$ approximation for CMV(p) and can be implemented to run in $O(n l 2^p)$ time. Furthermore, we introduce other variants of the problem of clustering fingerprints with at most p missing values, based on slightly different optimization criteria, and show that they can be approximated in polynomial time with ratios $2^{2p-1}$ and 2(1 - [EQUATION]), respectively.
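
The abstract does not spell out the greedy algorithm, so the following is only a minimal sketch of one plausible reading, under assumptions that are not taken from the text above: fingerprints are strings over 0, 1, and a missing-value symbol N; a cluster is a set of fingerprints that all agree with one fully resolved 0/1 vector; and the goal is to use as few clusters as possible. The sketch applies the standard greedy set-cover heuristic, repeatedly choosing the resolved vector compatible with the most unassigned fingerprints. The names `MISSING`, `resolutions`, `compatible`, and `greedy_clusters` are hypothetical, and this naive version makes no attempt to match the running time stated in the abstract.

```python
from itertools import product

MISSING = "N"  # assumed symbol for a missing value (not specified in the abstract)


def resolutions(fp):
    """Yield every 0/1 string obtained by filling in the missing positions of fp."""
    holes = [i for i, c in enumerate(fp) if c == MISSING]
    for bits in product("01", repeat=len(holes)):
        chars = list(fp)
        for i, b in zip(holes, bits):
            chars[i] = b
        yield "".join(chars)


def compatible(fp, resolved):
    """True if fp agrees with the resolved vector at every non-missing position."""
    return all(a == MISSING or a == b for a, b in zip(fp, resolved))


def greedy_clusters(fingerprints):
    """Greedy set-cover style clustering (illustrative sketch, not the paper's code).

    Returns a list of (resolved_vector, member_indices) pairs.
    """
    remaining = set(range(len(fingerprints)))
    clusters = []
    while remaining:
        # Candidate cluster centres: all resolutions of still-unassigned fingerprints.
        candidates = sorted(
            {r for i in remaining for r in resolutions(fingerprints[i])}
        )
        best_vec, best_members = None, []
        for vec in candidates:
            members = [
                i for i in sorted(remaining) if compatible(fingerprints[i], vec)
            ]
            if len(members) > len(best_members):
                best_vec, best_members = vec, members
        clusters.append((best_vec, best_members))
        remaining -= set(best_members)
    return clusters


if __name__ == "__main__":
    for vec, members in greedy_clusters(["01N", "0N1", "011", "10N", "1N0"]):
        print(vec, members)
```

On the five example fingerprints, the first pick is the vector 011, which is compatible with the first three inputs, and the second pick is 100, which covers the remaining two. Each fingerprint with at most p missing values contributes at most 2^p candidate vectors, which is why the candidate set stays manageable for small p.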