Improving the performance of an incremental algorithm driven by error margins

  • Authors:
  • José del Campo-Ávila; Gonzalo Ramos-Jiménez; João Gama; Rafael Morales-Bueno

  • Affiliations:
  • (Correspd. Tel.: +34 95 213 28 63/ Fax: +34 95 213 13 97/ E-mail: jcampo@lcc.uma.es) Department of Languages and Computer Science, Universidad de Málaga, E.T.S.I. Informática, Campus de ...; Department of Languages and Computer Science, Universidad de Málaga, E.T.S.I. Informática, Campus de Teatinos, 29071 Málaga, Spain; Laboratory of Artificial Intelligence and Computer Science and Faculty of Economics, University of Porto, Rua de Ceuta, 118, 6, 4150-190 Porto, Portugal; Department of Languages and Computer Science, Universidad de Málaga, E.T.S.I. Informática, Campus de Teatinos, 29071 Málaga, Spain

  • Venue:
  • Intelligent Data Analysis - Knowledge Discovery from Data Streams
  • Year:
  • 2008

Abstract

Classification is an important task within the field of data analysis. It is not trivial, and different difficulties can arise depending on the nature of the problem. These difficulties can become worse when the datasets are too large or when new information can arrive at any time. Incremental learning is an approach that can be used to deal with the classification task in these cases. It must alleviate, or solve, the problem of limited time and memory resources. One emerging approach uses concentration bounds to ensure that decisions are made only when enough information supports them. IADEM is one of the most recent algorithms that use this approach. The aim of this paper is to improve the performance of this algorithm in several ways: reducing the complexity of the induced models, adding the ability to deal with continuous data, improving the detection of noise, selecting new criteria for evolving the model, and using more powerful prediction techniques, among others. Besides these new properties, the new system, IADEM-2, preserves the ability to achieve performance similar to that of standard learning algorithms independently of dataset size, and it can incorporate new information as the basic algorithm does: using a short time per example.
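
The idea of letting a concentration bound decide when enough information supports a decision can be illustrated with a minimal sketch. The code below is not the IADEM or IADEM-2 procedure, which the abstract does not detail; it assumes a plain two-sided Hoeffding bound, and the function names `hoeffding_epsilon` and `enough_evidence`, the confidence parameter `delta`, and the sample values are all hypothetical choices made for illustration.

```python
import math


def hoeffding_epsilon(n, delta, value_range=1.0):
    """Error margin from the two-sided Hoeffding bound.

    With probability at least 1 - delta, the mean of n independent
    observations bounded in an interval of width value_range lies
    within this margin of its expected value.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def enough_evidence(best_mean, second_mean, n, delta, value_range=1.0):
    """Illustrative decision rule (an assumption, not the paper's rule).

    Commit to the best candidate only when its interval no longer
    overlaps the runner-up's, i.e. the observed gap exceeds twice
    the error margin.
    """
    eps = hoeffding_epsilon(n, delta, value_range)
    return (best_mean - second_mean) > 2.0 * eps


# Hypothetical scores for two candidate decisions seen on a stream.
few = enough_evidence(0.70, 0.60, n=200, delta=0.05)    # margin still too wide
many = enough_evidence(0.70, 0.60, n=2000, delta=0.05)  # margin has shrunk
print(few, many)
```

Because the margin shrinks in proportion to the inverse square root of `n`, the same observed gap that is inconclusive after a few hundred examples becomes decisive after a few thousand, which is what allows an incremental learner to postpone a decision until the data supports it.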