Predicting {0,1}-functions on randomly drawn points
COLT '88: Proceedings of the First Annual Workshop on Computational Learning Theory
A training algorithm for optimal margin classifiers
COLT '92: Proceedings of the Fifth Annual Workshop on Computational Learning Theory
The minimum consistent DFA problem cannot be approximated within any polynomial
Journal of the ACM (JACM)
Efficient learning of typical finite automata from random walks
STOC '93: Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing
An introduction to computational learning theory
Machine Learning
Formal languages: an introduction and a synopsis
Handbook of formal languages, vol. 1
On the learnability and usage of acyclic probabilistic finite automata
Journal of Computer and System Sciences, special issue on the Eighth Annual Workshop on Computational Learning Theory (July 5–8, 1995)
Generalization performance of support vector machines and other pattern classifiers
Advances in kernel methods
Inference of reversible languages
Journal of the ACM (JACM)
Learning subsequential transducers for pattern recognition interpretation tasks
IEEE Transactions on Pattern Analysis and Machine Intelligence
Piecewise testable events
Proceedings of the 2nd GI Conference on Automata Theory and Formal Languages
Rational kernels: theory and algorithms
The Journal of Machine Learning Research
Kernel methods for learning languages
Theoretical Computer Science
Learning languages with rational kernels
COLT '07: Proceedings of the 20th Annual Conference on Learning Theory
Some alternatives to Parikh matrices using string kernels
Fundamenta Informaticae
On the learnability of shuffle ideals
ALT '12: Proceedings of the 23rd International Conference on Algorithmic Learning Theory
On the learnability of shuffle ideals
The Journal of Machine Learning Research
This paper presents a new paradigm for learning languages: map strings into an appropriately chosen high-dimensional feature space and learn a separating hyperplane in that space. It initiates the study of the linear separability of automata and languages through the rich class of piecewise-testable languages, introducing a high-dimensional feature map under which every piecewise-testable language is shown to be linearly separable; the proof relies on word-combinatorial results about subsequences. The positive definite kernel associated with this embedding can be computed in quadratic time, and the paper examines how support vector machines combined with this kernel determine a separating hyperplane, together with the corresponding learning guarantees. Finally, it proves that every language linearly separable under a regular finite cover embedding, a generalization of the embedding used here, is regular.
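For concreteness, here is a rough sketch (not code from the paper) of one natural reading of this setup: under the binary feature map that sets coordinate u of a string x to 1 when u is a subsequence of x, the associated kernel K(x, y) counts the distinct common subsequences of x and y, and a standard dynamic program computes it in time quadratic in the string lengths for a fixed alphabet. The function name common_subsequence_kernel and the toy data below are illustrative assumptions, not taken from the paper; the SVM step uses scikit-learn's precomputed-kernel interface.

```python
import numpy as np
from sklearn.svm import SVC


def common_subsequence_kernel(x: str, y: str) -> int:
    """Count the distinct common subsequences of x and y (empty string included).

    This equals the inner product of the binary subsequence-indicator feature
    vectors of x and y, so it is a positive definite kernel. Running time is
    O(|alphabet| * len(x) * len(y)), quadratic for a fixed alphabet.
    """
    m, n = len(x), len(y)
    alphabet = sorted(set(x) & set(y))

    def last_occurrence(s: str, length: int):
        # table[c][i] = last 1-based position of c in s[:i], or 0 if absent.
        table = {c: [0] * (length + 1) for c in alphabet}
        for i in range(1, length + 1):
            for c in alphabet:
                table[c][i] = table[c][i - 1]
            if s[i - 1] in table:
                table[s[i - 1]][i] = i
        return table

    last_x, last_y = last_occurrence(x, m), last_occurrence(y, n)

    # N[i][j] = number of distinct common subsequences of x[:i] and y[:j].
    # A nonempty common subsequence ending in character c is either "c" itself
    # or a common subsequence of the prefixes preceding the last occurrences of
    # c, extended by c; summing N[p-1][q-1] over those last occurrences p, q
    # counts each distinct string exactly once.
    N = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            total = 1  # the empty subsequence
            for c in alphabet:
                p, q = last_x[c][i], last_y[c][j]
                if p and q:
                    total += N[p - 1][q - 1]
            N[i][j] = total
    return N[m][n]


# Toy usage: strings labeled by whether "ab" occurs as a subsequence
# (the shuffle ideal of "ab", a piecewise-testable language).
train = ["ab", "ba", "aab", "bba", "cacb", "bca"]
labels = [1, 0, 1, 0, 1, 0]

gram = np.array([[common_subsequence_kernel(s, t) for t in train] for s in train])
clf = SVC(kernel="precomputed").fit(gram, labels)

test = ["acb", "bbb"]
gram_test = np.array([[common_subsequence_kernel(s, t) for t in train] for s in test])
print(clf.predict(gram_test))  # tiny sample, so predictions are illustrative only
```

Under this reading, the quadratic-time claim corresponds to the (len(x)+1) by (len(y)+1) table above, and the Gram matrix is positive semidefinite because each entry is an inner product of finite binary feature vectors.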