Generalization and specialization strategies for learning r.e. languages

  • Authors:
  • Sanjay Jain; Arun Sharma

  • Affiliations:
  • Department of Information Systems and Computer Science, National University of Singapore, Singapore 119260, Republic of Singapore. E-mail: sanjay@iscs.nus.edu.sg; School of Computer Science and Engineering, The University of New South Wales, Sydney, NSW 2052, Australia. E-mail: arun@cse.unsw.edu.au

  • Venue:
  • Annals of Mathematics and Artificial Intelligence
  • Year:
  • 1998

Abstract

Overgeneralization is a major issue in the identification of grammars for formal languages from positive data. Different formulations of generalization and specialization strategies have been proposed to address this problem, and recently there has been a flurry of activity investigating such strategies in the context of indexed families of recursive languages. The present paper studies the power of these strategies to learn recursively enumerable languages from positive data. In particular, the power of strong‐monotonic, monotonic, and weak‐monotonic strategies (together with their dual notions modeling specialization) is investigated for identification of r.e. languages. These investigations differ from the previous investigations on learning indexed families of recursive languages and at times require new proof techniques. A complete picture is provided for the relative power of each of the strategies considered. An interesting consequence is that the power of weak‐monotonic strategies is equivalent to that of conservative strategies. This result parallels the scenario for indexed classes of recursive languages. It is also shown that any identifiable collection of r.e. languages can be identified by a strategy that exhibits the dual of the weak‐monotonic property. An immediate consequence of the proof of this result is that if attention is restricted to infinite r.e. languages, then conservative strategies can identify every identifiable collection.
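
The constraints named in the abstract can be illustrated on a toy scale. The sketch below is not taken from the paper: it assumes the definitions commonly used in the inductive inference literature, and it represents each conjectured language as a finite set of integers, whereas r.e. languages in general cannot be compared this way. All function names are hypothetical. A learner is modeled only by the sequence of conjectures it emits after each prefix of a text (a presentation of positive data).

```python
from typing import FrozenSet, List, Sequence

Lang = FrozenSet[int]  # toy stand-in for the language of a conjectured grammar


def prefix_contents(text: Sequence[int]) -> List[Lang]:
    """Set of elements seen after each non-empty prefix of the text."""
    return [frozenset(text[: i + 1]) for i in range(len(text))]


def is_strong_monotonic(hyps: Sequence[Lang]) -> bool:
    """Every later conjecture contains every earlier one (pure generalization)."""
    # Set inclusion is transitive, so checking consecutive pairs is enough.
    return all(hyps[i] <= hyps[i + 1] for i in range(len(hyps) - 1))


def is_weak_monotonic(text: Sequence[int], hyps: Sequence[Lang]) -> bool:
    """While the data seen so far fits an earlier conjecture, later
    conjectures must contain that earlier conjecture."""
    seen = prefix_contents(text)
    return all(
        hyps[n] <= hyps[m]
        for n in range(len(hyps))
        for m in range(n + 1, len(hyps))
        if seen[m] <= hyps[n]
    )


def is_conservative(text: Sequence[int], hyps: Sequence[Lang]) -> bool:
    """A conjecture is abandoned only when the data contradicts it.
    Assumption: equality of sets stands in for equality of grammars."""
    seen = prefix_contents(text)
    return all(
        hyps[m] == hyps[n]
        for n in range(len(hyps))
        for m in range(n + 1, len(hyps))
        if seen[m] <= hyps[n]
    )


text = [0, 1, 2, 3]
# First conjecture overgeneralizes, then the learner retreats.
overgeneral = [frozenset(range(6)), frozenset({0, 1}),
               frozenset({0, 1, 2}), frozenset({0, 1, 2, 3})]
# First conjecture is wrong but is refuted by the data before being dropped.
refuted = [frozenset({0, 9}), frozenset({0, 1}),
           frozenset({0, 1, 2}), frozenset({0, 1, 2, 3})]

print(is_strong_monotonic(overgeneral), is_weak_monotonic(text, overgeneral))
print(is_strong_monotonic(refuted), is_weak_monotonic(text, refuted))
```

In this toy run, the `overgeneral` sequence violates all three checks because the learner shrinks a conjecture that the data still supports, while the `refuted` sequence fails only the strong-monotonic check: it drops the element 9, but only after the data no longer fits the first conjecture.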