Inference of k-Testable Languages in the Strict Sense and Application to Syntactic Pattern Recognition

  • Authors: P. García; E. Vidal

  • Venue: IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year: 1990

Abstract

The inductive inference of the class of k-testable languages in the strict sense (k-TSSL) is considered. A k-TSSL is essentially defined by a finite set of substrings of length k that are permitted to appear in the strings of the language. Given a positive sample R of strings of an unknown language, a deterministic finite-state automaton that recognizes the smallest k-TSSL containing R is obtained. The inferred automaton is shown to have a number of transitions bounded by O(m), where m is the number of substrings defining this k-TSSL, and the inference algorithm runs in O(kn log m) time, where n is the sum of the lengths of all the strings in R. The proposed methods are illustrated through syntactic pattern recognition experiments in which strings generated by ten given (source) non-k-TSSL grammars are used to infer ten stochastic k-TSSL automata, which are then used to classify new strings generated by the same source grammars. The results of these experiments are consistent with the theory and show the ability of (stochastic) k-TSSLs to approximate other classes of regular languages.
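
The idea of inferring a k-TSSL from a positive sample can be illustrated with a minimal set-based sketch in Python. It is not the paper's automaton construction and does not aim for the stated O(kn log m) bound. The function names `infer_ktss` and `accepts` are hypothetical, and the handling of strings shorter than k - 1 (accepted only if they occur verbatim in the sample) is an assumption that may differ from the paper's formal definition.

```python
def infer_ktss(sample, k):
    """Collect the sets that describe a k-testable language from positive strings.

    Assumption: strings shorter than k - 1 are stored verbatim in `short`.
    """
    if k < 2:
        raise ValueError("this sketch assumes k >= 2")
    prefixes, suffixes, segments, short = set(), set(), set(), set()
    for w in sample:
        if len(w) < k - 1:
            short.add(w)
            continue
        prefixes.add(w[:k - 1])             # initial segment of length k - 1
        suffixes.add(w[len(w) - (k - 1):])  # final segment of length k - 1
        for i in range(len(w) - k + 1):     # every substring of length k
            segments.add(w[i:i + k])
    return prefixes, suffixes, segments, short


def accepts(model, w, k):
    """Check whether w is compatible with the inferred sets."""
    prefixes, suffixes, segments, short = model
    if len(w) < k - 1:
        return w in short
    return (
        w[:k - 1] in prefixes
        and w[len(w) - (k - 1):] in suffixes
        and all(w[i:i + k] in segments for i in range(len(w) - k + 1))
    )


sample = ["abab", "abb"]
model = infer_ktss(sample, 2)
print(accepts(model, "ababb", 2))  # generalizes beyond the sample
print(accepts(model, "aab", 2))    # "aa" was never observed
```

Every string of the sample is accepted by construction, and the language grows only through strings whose prefix, suffix, and length-k substrings were all observed, which is the sense in which the inferred language stays small around R.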