In recent years, several approaches to computing the p-length gap-weighted kernel have been presented, such as the trie-based approach, the suffix kernel, and the range-sum technique. Although other approaches can achieve better performance in some cases, the suffix kernel technique remains efficient in this context. In this paper, we present a series of dynamic programming algorithms based on the suffix kernel to compute gapped string kernels. Given strings s and t and a gap penalty λ, the all-length gap-weighted kernel can be calculated in time O(|s||t|) with our algorithms. Moreover, we present some new string kernels belonging to the family of gapped string kernels, including the all-length and p-length match-weighted kernels and their variants. Based on the suffix kernel technique, we can compute the all-length match-weighted kernel in time O(|s||t|), and then the p-length kernel in time O(p|s||t|) using the relationship between all-length and p-length kernels. Furthermore, for the p-length match-weighted kernel and its variant, a bit-parallel technique is used to reduce the complexity from O(p|s||t|) to O(⌈pk/w⌉|s||t|), where w is the word size of the machine (e.g. 32 or 64 in practice) and k is determined by the longest matching subsequence of the two strings s and t. The empirical results suggest that the suffix kernel technique is an important and useful approach to computing gapped string kernels.
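For orientation, here is a minimal sketch of how a suffix-kernel recursion can yield the p-length gap-weighted kernel in time O(p|s||t|). It follows the standard textbook dynamic program for gap-weighted subsequence kernels, not the algorithms proposed in this paper. The function name `gap_weighted_kernels`, the table names, and the convention that a matched subsequence contributes λ raised to the sum of its spans in s and t are all assumptions made for illustration.

```python
def gap_weighted_kernels(s, t, p, lam):
    """Return [K_1, ..., K_p], the gap-weighted subsequence kernels of s and t.

    Assumed convention: a common subsequence occurrence contributes
    lam ** (span in s + span in t). Requires p >= 1.
    """
    n, m = len(s), len(t)

    # dps[i][j] is the suffix kernel: the weight of all common subsequences
    # of the current length whose last symbols are s[i-1] and t[j-1].
    # Row 0 and column 0 are a zero border.
    dps = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s[i - 1] == t[j - 1]:
                dps[i][j] = lam * lam
    kernels = [sum(map(sum, dps))]  # K_1

    for _ in range(2, p + 1):
        # dp[i][j] accumulates dps over all (k, l) with k <= i, l <= j,
        # discounted by lam ** ((i - k) + (j - l)) for the gaps introduced.
        dp = [[0.0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                dp[i][j] = (dps[i][j]
                            + lam * dp[i - 1][j]
                            + lam * dp[i][j - 1]
                            - lam * lam * dp[i - 1][j - 1])

        # Extend every shorter subsequence by one matching pair of symbols.
        nxt = [[0.0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if s[i - 1] == t[j - 1]:
                    nxt[i][j] = lam * lam * dp[i - 1][j - 1]
        dps = nxt
        kernels.append(sum(map(sum, dps)))

    return kernels


# "cat" vs "cart" with lam = 0.5: K_2 = lam**4 + lam**5 + lam**7
print(gap_weighted_kernels("cat", "cart", 2, 0.5))  # [0.75, 0.1015625]
```

Each of the p passes fills two |s| x |t| tables, which gives the O(p|s||t|) bound. Summing the returned list over all lengths up to min(|s|, |t|) gives an all-length kernel, but only at that higher cost; the O(|s||t|) all-length computation claimed above depends on the paper's own recursions, which this sketch does not attempt to reproduce.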