Many classification algorithms were originally designed for fixed-size vectors. Recent applications in text and speech processing and in computational biology, however, require the analysis of variable-length sequences and, more generally, weighted automata. Kernel methods are widely used in statistical learning techniques such as Support Vector Machines (SVMs) because of their computational efficiency in high-dimensional feature spaces. We introduce a general family of kernels based on weighted transducers or rational relations, rational kernels, that extend kernel methods to the analysis of variable-length sequences or, more generally, weighted automata. We show that rational kernels can be computed efficiently using a general algorithm for the composition of weighted transducers and a general single-source shortest-distance algorithm. Not all rational kernels are positive definite and symmetric (PDS), or equivalently satisfy the Mercer condition, a condition that guarantees the convergence of training for discriminant classification algorithms such as SVMs. We present several theoretical results related to PDS rational kernels. We show that under some general conditions these kernels are closed under sum, product, or Kleene-closure, and we give a general method for constructing a PDS rational kernel from an arbitrary transducer defined on some non-idempotent semirings. We prove several characterization results that can be used to guide the design of PDS rational kernels. We also show that some commonly used string kernels or similarity measures, such as the edit-distance, the convolution kernels of Haussler, and some string kernels used in computational biology, are specific instances of rational kernels. Our results include the proof that the edit-distance over a non-trivial alphabet is not negative definite, which, to the best of our knowledge, was never stated or proved before.
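The last claim can be checked numerically on a small case. The sketch below assumes the standard definition of a negative definite function d: it is symmetric, and the sum over i, j of c_i c_j d(x_i, x_j) is at most 0 for all points x_i and all real coefficients c_i that sum to zero. The five strings over {a, b} and the coefficients are an illustration chosen here, not necessarily the construction used in the proof, and the helper names are hypothetical.

```python
def edit_distance(s, t):
    """Levenshtein distance with unit-cost insertion, deletion, substitution."""
    prev = list(range(len(t) + 1))          # distances from "" to prefixes of t
    for i, a in enumerate(s, start=1):
        curr = [i]                          # distance from s[:i] to ""
        for j, b in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,              # delete a
                            curr[j - 1] + 1,          # insert b
                            prev[j - 1] + (a != b)))  # substitute or match
        prev = curr
    return prev[-1]


def quadratic_form(points, coeffs, dist):
    """Sum of c_i * c_j * dist(x_i, x_j) over all ordered pairs (i, j)."""
    return sum(ci * cj * dist(xi, xj)
               for xi, ci in zip(points, coeffs)
               for xj, cj in zip(points, coeffs))


# Illustrative configuration (an assumption of this sketch): "ab" and "ba"
# are at distance 2, the other three strings are pairwise at distance 2,
# and every remaining pair is at distance 1.
strings = ["ab", "ba", "aa", "b", "bab"]
coeffs = [3, 3, -2, -2, -2]                 # coefficients sum to zero

value = quadratic_form(strings, coeffs, edit_distance)
print(value)  # 12
```

A strictly positive value for coefficients that sum to zero violates the inequality above, so on this configuration the edit-distance fails to be negative definite.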
Rational kernels can be combined with SVMs to form efficient and powerful techniques for a variety of classification tasks in text and speech processing and in computational biology. We describe examples of general families of PDS rational kernels that are useful in many of these applications and report the results of experiments illustrating the use of rational kernels in several difficult large-vocabulary spoken-dialog classification tasks based on deployed spoken-dialog systems. Our results show that rational kernels are easy to design and implement and lead to substantial improvements in classification accuracy.
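As a rough illustration of what such a kernel computes, the sketch below evaluates an n-gram count kernel directly on plain sequences. The choice of n-gram counts is an assumption made here for illustration, since the abstract does not name the families it describes. The kernel has the form K(x, y) = sum over z of T(x, z) T(y, z), with T(x, z) taken to be the number of occurrences of the n-gram z in x. No transducer composition or shortest-distance computation is performed, so this is a minimal stand-in rather than the general algorithm, and the function names are hypothetical.

```python
from collections import Counter


def ngram_counts(seq, n):
    """Count every contiguous n-gram of seq (a string or a list of tokens)."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def ngram_kernel(x, y, n=2):
    """K(x, y) = sum over n-grams z of count_x(z) * count_y(z)."""
    cx, cy = ngram_counts(x, n), ngram_counts(y, n)
    if len(cx) > len(cy):                   # iterate over the smaller table
        cx, cy = cy, cx
    return sum(c * cy[z] for z, c in cx.items())


# Character bigrams: "abab" has ab x2 and ba x1; "bab" has ba x1 and ab x1.
print(ngram_kernel("abab", "bab"))          # 3

# The same function works on word sequences.
u = ["i", "want", "to", "pay"]
v = ["want", "to", "cancel"]
print(ngram_kernel(u, v))                   # 1
```

Because the value is an inner product of count vectors, the kernel is symmetric and positive definite by construction, which is the property an SVM needs.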