Haplotype assembly is the problem of inferring a pair of haplotypes from localized polymorphism data. This paper proposes SSK (Semi-Supervised K-means), a semi-supervised clustering algorithm for haplotype assembly that is, to our knowledge, the first semi-supervised clustering method applied to this problem. SSK first extracts some positive information from the data and then uses it to guide k-means in clustering all SNP fragments into two sets, from which the two haplotypes are reconstructed. The performance of SSK is tested on both real and simulated data. The results show that it outperforms several state-of-the-art algorithms under the Minimum Error Correction (MEC) model.
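The pipeline described above can be illustrated with a minimal sketch. It is not the SSK algorithm itself: the abstract does not say what form the positive information takes or how it is extracted, so the sketch assumes it is given as must-link pairs of fragment indices. The fragment encoding (`0`, `1`, `-` for an uncovered site), the seeding by two chosen fragments, and all function names are likewise illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple

GAP = "-"  # assumed encoding: SNP site not covered by the fragment


def distance(frag: str, hap: str) -> int:
    """Number of mismatches at sites where both strings are non-gap."""
    return sum(1 for a, b in zip(frag, hap) if GAP not in (a, b) and a != b)


def consensus(frags: Sequence[str], length: int) -> str:
    """Majority vote per SNP site; gap if uncovered, ties resolved to '0'."""
    out = []
    for j in range(length):
        ones = sum(1 for f in frags if f[j] == "1")
        zeros = sum(1 for f in frags if f[j] == "0")
        out.append(GAP if ones + zeros == 0 else ("1" if ones > zeros else "0"))
    return "".join(out)


def must_link_groups(n: int, must_link: Sequence[Tuple[int, int]]) -> List[List[int]]:
    """Merge fragments joined by must-link pairs (union-find)."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in must_link:
        parent[find(a)] = find(b)
    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def constrained_two_means(
    frags: Sequence[str],
    must_link: Sequence[Tuple[int, int]],  # assumed form of the positive information
    seeds: Tuple[int, int] = (0, 1),       # hypothetical choice of initial centers
    max_iter: int = 50,
) -> Tuple[List[int], List[str], int]:
    """Cluster fragments into two sets; return labels, haplotypes, MEC score."""
    length = len(frags[0])
    groups = must_link_groups(len(frags), must_link)
    centers = [frags[seeds[0]], frags[seeds[1]]]
    labels = [-1] * len(frags)
    for _ in range(max_iter):
        new_labels = labels[:]
        for group in groups:
            # A must-link group moves as one unit to the cheaper center.
            costs = [sum(distance(frags[i], c) for i in group) for c in centers]
            k = 0 if costs[0] <= costs[1] else 1
            for i in group:
                new_labels[i] = k
        if new_labels == labels:
            break
        labels = new_labels
        for k in (0, 1):
            members = [f for f, lab in zip(frags, labels) if lab == k]
            if members:  # keep the previous center if a cluster empties
                centers[k] = consensus(members, length)
    # MEC score: corrections needed to make every fragment agree with its haplotype.
    mec = sum(distance(f, centers[lab]) for f, lab in zip(frags, labels))
    return labels, centers, mec


# Toy input: fragments drawn from two complementary haplotypes, one read error.
fragments = ["0101--", "1010--", "--0110", "--1001", "-1011-", "-0100-", "010010"]
labels, haplotypes, mec = constrained_two_means(fragments, must_link=[(0, 2), (1, 3)])
print(labels, haplotypes, mec)
```

Treating each must-link group as a single unit during assignment is one simple way to let prior pairings constrain k-means. The final sum is the MEC objective named in the abstract: the number of fragment entries that would have to be corrected to match the assigned haplotype.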