Learning Balls of Strings with Correction Queries

  • Authors:
  • Leonor Becerra Bonache;Colin de la Higuera;Jean-Christophe Janodet;Frédéric Tantini

  • Affiliations:
  • Research Group on Mathematical Linguistics, Rovira i Virgili University, Pl. Imperial Tárraco 1, 43005 Tarragona, Spain;Laboratoire Hubert Curien, Université Jean Monnet, 18 rue du Professeur Benoît Lauras, 42000 Saint-Étienne, France;Laboratoire Hubert Curien, Université Jean Monnet, 18 rue du Professeur Benoît Lauras, 42000 Saint-Étienne, France;Laboratoire Hubert Curien, Université Jean Monnet, 18 rue du Professeur Benoît Lauras, 42000 Saint-Étienne, France

  • Venue:
  • ECML '07 Proceedings of the 18th European conference on Machine Learning
  • Year:
  • 2007

Abstract

During the 1980s, Angluin introduced an active learning paradigm in which an Oracle is capable of answering both membership and equivalence queries. However, practical evidence tends to show that while the former are often available, the latter usually are not. We propose new queries, called correction queries, which we study in the framework of Grammatical Inference. When a string is submitted to the Oracle, either she validates it if it belongs to the target language, or she proposes a correction, i.e., a string of the language close to the query with respect to the edit distance. We also introduce a non-standard class of languages: the topological balls of strings. We show that this class is not learnable in Angluin's MAT model, but is learnable with a linear number of correction queries. We conduct several experiments with an Oracle simulating a human Expert, and show that our algorithm is resistant to approximate answers.
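
As an illustration of the notions named in the abstract, the following minimal Python sketch shows what a ball of strings and a correction query might look like. It is not the authors' algorithm. It assumes the unit-cost (Levenshtein) edit distance, a ball defined as all strings within a given radius of a centre string, and an Oracle that returns any closest string of the ball, found by brute force over a small alphabet. The names `edit_distance`, `in_ball` and `correction_query` are hypothetical, and the paper's exact definition of a correction may differ.

```python
from itertools import product


def edit_distance(u: str, v: str) -> int:
    """Unit-cost Levenshtein distance between u and v (assumed metric)."""
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        cur = [i]
        for j, b in enumerate(v, start=1):
            cur.append(min(
                prev[j] + 1,             # delete a from u
                cur[j - 1] + 1,          # insert b into u
                prev[j - 1] + (a != b),  # substitute a by b, or match
            ))
        prev = cur
    return prev[-1]


def in_ball(w: str, centre: str, radius: int) -> bool:
    """Membership in the ball: w is within `radius` edits of `centre`."""
    return edit_distance(centre, w) <= radius


def correction_query(w: str, centre: str, radius: int, alphabet: str = "ab"):
    """Hypothetical Oracle: None means 'yes, w is in the ball';
    otherwise return one closest string of the ball (brute force)."""
    if in_ball(w, centre, radius):
        return None
    best = None
    # Every string of the ball has length at most len(centre) + radius.
    for n in range(len(centre) + radius + 1):
        for letters in product(alphabet, repeat=n):
            u = "".join(letters)
            if in_ball(u, centre, radius):
                d = edit_distance(w, u)
                if best is None or d < best[0]:
                    best = (d, u)
    return best[1]
```

The brute-force search is exponential in the string length and is only meant for toy examples; it makes explicit that a correction is a member of the target ball at minimal edit distance from the submitted string.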