PSST... The Probabilistic Sequence Search Tool
BIBE '01 Proceedings of the 2nd IEEE International Symposium on Bioinformatics and Bioengineering
Kernel Methods for Pattern Analysis
Determining protein sequence similarity is an important task for protein classification and homology detection. This is typically done with sequence alignment algorithms, yet fast and accurate alignment-free, kernel-based classifiers exist. Viewing sequences as a “bag of words”, we test a simple weighted string kernel and investigate the effects of k-mer length, sequence length, and choice of weighting. We also extend the kernel to operate on the k-mer frequency representation of a sequence rather than the “bag of words” representation.
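To make the two representations concrete, the following is a minimal Python sketch of a weighted k-mer kernel. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the weighting scheme, so the per-k-mer `weights` dictionary, the cosine-style normalization, and all function names (`kmer_counts`, `kmer_frequencies`, `weighted_kernel`, `normalized_kernel`) are hypothetical choices made here.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq (the "bag of words" view)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_frequencies(seq, k):
    """Relative k-mer frequencies, which remove the effect of sequence length."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    if total == 0:  # sequence shorter than k
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def weighted_kernel(x, y, weights=None):
    """Weighted inner product over the k-mers shared by x and y.

    `weights` maps a k-mer to a non-negative weight (assumed scheme);
    k-mers missing from `weights` default to a weight of 1.0.
    """
    if len(y) < len(x):  # iterate over the smaller feature map
        x, y = y, x
    total = 0.0
    for kmer, value in x.items():
        if kmer in y:
            w = weights.get(kmer, 1.0) if weights else 1.0
            total += w * value * y[kmer]
    return total


def normalized_kernel(x, y, weights=None):
    """Cosine-style normalization so that K(x, x) equals 1 (an assumption)."""
    denom = math.sqrt(weighted_kernel(x, x, weights) * weighted_kernel(y, y, weights))
    return weighted_kernel(x, y, weights) / denom if denom else 0.0


# Usage: compare two short sequences under both representations.
a, b = "ABAB", "BABA"
k_counts = weighted_kernel(kmer_counts(a, 2), kmer_counts(b, 2))
k_freqs = weighted_kernel(kmer_frequencies(a, 2), kmer_frequencies(b, 2))
```

The same kernel function serves both representations because each is a sparse mapping from k-mers to numbers; only the feature map changes. A Gram matrix built from either version could then be passed to any kernel classifier that accepts precomputed kernels.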