BioDM'06 Proceedings of the 2006 international conference on Data Mining for Biomedical Applications
In computational proteomics, identifying a peptide by interpreting its tandem mass spectrum is an important problem. Correctly classifying the b and y ions in a spectrum plays a vital role in improving the accuracy of most existing identification algorithms. To address this problem, this paper proposes a classification method based on frequent pattern mining and decision trees. First, a dataset is built from identified spectra, in which each datum records the positions of the ions surrounding a b-type or y-type ion. The discriminative ion frequent patterns (DIFPs) of b and y ions are then mined from this dataset, and a decision tree model that organizes these DIFPs is proposed for classifying b and y ions. Finally, we develop an algorithm for b and y ion classification called B/Y-Classifier. Experimental results show that it achieves an accuracy of 92%.
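The pipeline described above (build per-ion records, mine patterns that separate the two classes, then organize them into a classifier) can be illustrated with a minimal sketch. This is not the paper's B/Y-Classifier: the abstract does not specify the mining algorithm, the thresholds, or the tree structure, so the sketch assumes brute-force itemset enumeration and an ordered decision list (a degenerate, single-path decision tree) instead. The function names, the `min_support` and `min_gap` parameters, the offset labels such as `"-28"`, and the toy records are all hypothetical and invented for illustration.

```python
from itertools import combinations

def support(pattern, data):
    """Fraction of records that contain every item of `pattern`."""
    if not data:
        return 0.0
    return sum(pattern <= record for record in data) / len(data)

def mine_discriminative(b_data, y_data, min_support=0.5, min_gap=0.4, max_len=2):
    """Enumerate itemsets up to `max_len` items and keep those that are
    frequent in one class and whose support differs between the classes
    by at least `min_gap`. Thresholds are assumptions, not the paper's."""
    items = sorted(set().union(*b_data, *y_data))
    found = []
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            pattern = frozenset(combo)
            sb, sy = support(pattern, b_data), support(pattern, y_data)
            gap = abs(sb - sy)
            if max(sb, sy) >= min_support and gap >= min_gap:
                found.append((pattern, "b" if sb > sy else "y", gap))
    # Most discriminative patterns first; ties broken deterministically.
    found.sort(key=lambda rule: (-rule[2], sorted(rule[0])))
    return found

def classify(record, rules, default="unknown"):
    """Decision list: the first matching pattern decides the label."""
    for pattern, label, _gap in rules:
        if pattern <= record:
            return label
    return default

# Invented toy records: each set lists (binned) mass offsets at which
# neighbouring peaks were seen around an ion of known type.
b_data = [
    frozenset({"-28", "-18"}),
    frozenset({"-28", "-17"}),
    frozenset({"-28", "-18", "+1"}),
    frozenset({"-18"}),
]
y_data = [
    frozenset({"-17", "+1"}),
    frozenset({"-18", "+1"}),
    frozenset({"+1"}),
    frozenset({"-17", "+1"}),
]

rules = mine_discriminative(b_data, y_data)
for pattern, label, gap in rules:
    print(sorted(pattern), label, round(gap, 2))

print(classify(frozenset({"-28", "-18"}), rules))
print(classify(frozenset({"-17", "+1"}), rules))
```

Ordering the rules by support gap means the most class-specific pattern is tested first, which loosely mirrors placing the most discriminative split near the root of a tree. A real implementation would likely replace the brute-force enumeration with a scalable miner such as FP-growth and learn a proper tree over the pattern features.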