A note on phase transitions and computational pitfalls of learning from sequences

  • Authors:
  • Antoine Cornuéjols;Michèle Sebag

  • Affiliations:
  • AgroParisTech / INRA UMR 518, Mathématiques et Informatique Appliquées, Paris, France F75005;Laboratoire de Recherche en Informatique, CNRS UMR 8623, Bât.490, Université de Paris-Sud, Orsay, France 91405

  • Venue:
  • Journal of Intelligent Information Systems
  • Year:
  • 2008

Abstract

An ever greater range of applications calls for learning from sequences. Grammar induction is a prominent tool for sequence learning, so it is important to know its properties and limits. This paper presents a new type of analysis for inductive learning. A few years ago, the discovery of a phase transition phenomenon in inductive logic programming proved that fundamental characteristics of learning problems may affect the very possibility of learning under very general conditions. We show that, in the case of grammatical inference, while there is no phase transition when considering the whole hypothesis space, there is a much more severe "gap" phenomenon affecting the effective search space of standard grammatical induction algorithms for deterministic finite automata (DFA). Focusing on standard search heuristics, we show that they overcome this difficulty to some extent, but that they are subject to overgeneralization. Finally, the paper suggests some directions for alleviating this problem.