A noise model on learning sets of strings

  • Authors: Yasubumi Sakakibara; Rani Siromoney

  • Affiliations: International Institute for Advanced Study of Social Information Science (IIAS-SIS), Fujitsu Laboratories Ltd., 140, Miyamoto, Numazu, Shizuoka 410-03, Japan; Madras Christian College, Tambaram, Madras 600 059, India

  • Venue: COLT '92 Proceedings of the fifth annual workshop on Computational learning theory
  • Year: 1992

Abstract

In this paper, we introduce a new noise model for learning sets of strings in the PAC learning framework and consider the effect of this noise on learning. The instance domain is the set Σ^n of strings over a finite alphabet Σ, and the examples are corrupted by purely random errors that affect only the instances (and not the labels). We consider three types of errors on instances, called EDIT operation errors. The EDIT operations are “insertion”, “deletion”, and “change” of a symbol in a string. We call noise in which the examples are corrupted by random EDIT operation errors on instances EDIT noise. First, we show general upper bounds on the EDIT noise rate that can be tolerated by a learning algorithm that uses the strategy of minimizing disagreements, and on the rate that can be tolerated by a learning algorithm in general. Next, we present an efficient algorithm that can learn a class of decision lists over attributes of the form “does a string w contain a pattern p?” from noisy examples, under some restriction on the EDIT noise rate.
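
To make the two objects named in the abstract concrete, the following Python sketch shows one possible reading of them. It is an illustration only, not the paper's construction: the abstract does not specify the noise distribution, so the sketch assumes that each instance receives at most one EDIT operation with probability `rate`, chosen uniformly. It also assumes that "contains a pattern p" means that p occurs as a contiguous substring, and that the decision list uses only positive tests. The names `edit_noise` and `evaluate_decision_list` are hypothetical.

```python
import random


def edit_noise(w, alphabet, rate, rng):
    """Corrupt the instance w with one random EDIT operation.

    Assumed model (not taken from the paper): with probability `rate`
    exactly one operation is applied, chosen uniformly; the label of the
    example is never touched. Assumes `alphabet` has at least two symbols.
    """
    if rng.random() >= rate:
        return w  # instance passes through uncorrupted
    # An empty string can only receive an insertion.
    ops = ["insert", "delete", "change"] if w else ["insert"]
    op = rng.choice(ops)
    if op == "insert":
        i = rng.randrange(len(w) + 1)
        return w[:i] + rng.choice(alphabet) + w[i:]
    i = rng.randrange(len(w))
    if op == "delete":
        return w[:i] + w[i + 1:]
    # "change": replace the symbol at position i with a different one.
    others = [a for a in alphabet if a != w[i]]
    return w[:i] + rng.choice(others) + w[i + 1:]


def evaluate_decision_list(rules, default, w):
    """Evaluate a decision list over attributes "does w contain p?".

    `rules` is an ordered list of (pattern, label) pairs; the first
    pattern that occurs in w as a substring decides the label.
    """
    for pattern, label in rules:
        if pattern in w:
            return label
    return default


if __name__ == "__main__":
    rng = random.Random(0)
    rules = [("ab", True), ("b", False)]  # hypothetical target concept
    w = "aab"
    label = evaluate_decision_list(rules, True, w)  # label from the clean instance
    noisy_w = edit_noise(w, "ab", 0.3, rng)         # only the instance is corrupted
    print(noisy_w, label)
```

Under this reading, a learner sees pairs such as `(noisy_w, label)`, where the label was computed on the clean string, so a single EDIT operation can make a correctly labeled example look inconsistent with the target decision list.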