We propose a new string kernel based on variable-length-don't-care patterns (VLDC patterns). A VLDC pattern is an element of (Σ ∪ {⋆})*, where Σ is an alphabet and ⋆ is the variable-length-don't-care symbol that matches any string in Σ*. The number of VLDC patterns matching a given string s of length n is O(2^{2n}). We present an O(n^5) algorithm for computing the kernel value. We also propose variations of the kernel which modify the relative weights of each pattern. We evaluate our kernels using a support vector machine to classify spam data.
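To make the definition of a VLDC pattern concrete, the following is a minimal brute-force sketch in Python. It is not the O(n^5) algorithm mentioned in the abstract. Several things here are assumptions made for illustration only: the character `*` stands in for the variable-length-don't-care symbol, a pattern is taken to match a string when the whole string can be obtained by substituting each `*` with some (possibly empty) string, adjacent `*` symbols are collapsed into one, and the names `vldc_matches`, `matching_patterns`, and `shared_pattern_count` are hypothetical. The unweighted count of shared patterns is one plausible reading of a pattern-based kernel and may differ from the kernel the paper actually defines.

```python
import re
from itertools import product

STAR = "*"  # stand-in for the VLDC symbol; assumed not to occur in the input strings


def vldc_matches(pattern, s):
    """Return True if the VLDC pattern matches the whole string s."""
    # Each STAR becomes ".*" (any string, possibly empty); literal parts are escaped.
    regex = ".*".join(re.escape(part) for part in pattern.split(STAR))
    return re.fullmatch(regex, s, flags=re.DOTALL) is not None


def matching_patterns(s):
    """Enumerate every pattern without adjacent STARs that matches s.

    Exponential brute force, intended only for very short strings.
    A matching pattern has at most len(s) literals and len(s) + 1 STARs,
    so candidates longer than 2 * len(s) + 1 need not be tried.
    """
    alphabet = sorted(set(s)) + [STAR]
    found = set()
    for length in range(2 * len(s) + 2):
        for symbols in product(alphabet, repeat=length):
            pattern = "".join(symbols)
            if STAR * 2 in pattern:
                continue  # assumption: "**" is treated as equivalent to "*"
            if vldc_matches(pattern, s):
                found.add(pattern)
    return found


def shared_pattern_count(s, t):
    """Hypothetical unweighted similarity: number of patterns matching both strings."""
    return len(matching_patterns(s) & matching_patterns(t))


if __name__ == "__main__":
    print(vldc_matches("a*b", "axxb"))      # True
    print(len(matching_patterns("ab")))     # 13
    print(shared_pattern_count("ab", "b"))  # 3
```

Under these assumptions, each of the n characters of s is either kept as a literal or absorbed by a `*`, and each of the n + 1 boundaries may or may not carry a `*`, which gives at most 2^{2n+1} patterns and is consistent with the O(2^{2n}) bound stated above. For s = "ab" the enumeration yields 13 patterns, which is below 2^5 = 32 because a `*` is forced wherever a character is dropped.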