This paper proposes a new knowledge-based method for clustering metagenome short reads. The method incorporates biological knowledge into the clustering process by associating a list of proteins with each read. These proteins are chosen from a reference proteome database according to their similarity to the given read, as evaluated by BLAST. We introduce a scoring function that weights the resulting proteins, and we use the weighted proteins to cluster the reads. The resulting clustering algorithm selects the number of clusters automatically and generates possibly overlapping clusters of reads. Experiments on real-life benchmark datasets show that the method is effective at reducing the size of a metagenome dataset while maintaining high accuracy of its organism content.
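The pipeline described above (BLAST hits per read, protein weighting, overlapping clusters with an automatically chosen count) can be illustrated with a minimal Python sketch. The abstract does not give the scoring function or the selection rule, so both are assumptions here: each protein is weighted by the sum of per-read normalized bit scores, and proteins are picked greedily until every read with at least one hit belongs to a cluster. The names `protein_weights`, `cluster_reads`, and `example_hits` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def protein_weights(hits):
    """Weight each protein by summing its share of every read's total bit score.

    hits maps read_id -> list of (protein_id, bit_score) pairs, e.g. parsed
    from BLAST tabular output. This normalization is an assumed stand-in for
    the paper's scoring function, which the abstract does not specify.
    """
    weights = defaultdict(float)
    for read_hits in hits.values():
        total = sum(score for _, score in read_hits)
        if total <= 0:
            continue  # read has no usable hits
        for protein_id, score in read_hits:
            weights[protein_id] += score / total
    return dict(weights)


def cluster_reads(hits):
    """Greedily pick proteins until every read with a hit is in some cluster.

    Each chosen protein defines one cluster holding all reads that hit it,
    so clusters may overlap and their number is not fixed in advance.
    The greedy rule is an assumption, not the published algorithm.
    """
    weights = protein_weights(hits)

    # Invert the mapping: protein -> set of reads that hit it.
    members = defaultdict(set)
    for read_id, read_hits in hits.items():
        for protein_id, _ in read_hits:
            members[protein_id].add(read_id)

    uncovered = {read_id for read_id, read_hits in hits.items() if read_hits}
    clusters = {}
    while uncovered:
        # Prefer the protein covering the most uncovered reads; break ties
        # by weight, then by identifier to keep the result deterministic.
        best = max(
            (p for p in members if p not in clusters),
            key=lambda p: (len(members[p] & uncovered), weights.get(p, 0.0), p),
        )
        clusters[best] = set(members[best])
        uncovered -= members[best]
    return clusters


# Toy input: four reads, one of which (r4) has no BLAST hit.
example_hits = {
    "r1": [("P1", 50.0), ("P2", 50.0)],
    "r2": [("P1", 80.0)],
    "r3": [("P2", 60.0), ("P3", 20.0)],
    "r4": [],
}
clusters = cluster_reads(example_hits)
```

In this toy input, read `r1` lands in two clusters, which shows the overlapping behaviour, and `r4` stays unclustered because it has no hits. A representative read or protein per cluster could then stand in for the full set, which is one way such a clustering might reduce dataset size.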