Novelty detection with application to data streams

  • Authors:
  • Eduardo J. Spinosa; André Ponce de Leon F. de Carvalho; João Gama

  • Affiliations:
  • (Correspd. E-mail: ejspin@icmc.usp.br (or ejspin@yahoo.com)) University of São Paulo (USP), Institute of Mathematical and Computer Sciences (ICMC), Caixa Postal 668, 13560-970, São Carlo ...; University of São Paulo (USP), Institute of Mathematical and Computer Sciences (ICMC), Caixa Postal 668, 13560-970, São Carlos, SP, Brazil; University of Porto (UP), Laboratory of Artificial Intelligence and Decision Support (LIAAD), Rua de Ceuta, 118, 6° 4150-190, Porto, Portugal

  • Venue:
  • Intelligent Data Analysis - Knowledge Discovery from Data Streams
  • Year:
  • 2009

Abstract

This paper presents and evaluates an approach to novelty detection that addresses it as the problem of identifying novel concepts in a continuous learning scenario, as an extension to a single-class classification problem. OLINDDA, an OnLIne Novelty and Drift Detection Algorithm that implements this approach, uses efficient standard clustering algorithms to continuously generate candidate clusters among examples that were not explained by the current known concepts. Clusters complying with a validation criterion that takes cohesiveness and representativeness into account are initially identified as concepts. By merging similar concepts, OLINDDA may enhance the representation of some concepts as it advances toward its final goal of describing novel emerging concepts in an unsupervised way. The proposed approach is experimentally evaluated by the use of several measures taken throughout the learning process. Results show that it is capable of identifying novel concepts that are pure and correspond to real classes, disregarding unrepresentative clusters and outliers.
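The loop described in the abstract (set aside examples that the known concepts do not explain, cluster them, and keep only clusters that are cohesive and representative) can be illustrated with a minimal sketch. This is not OLINDDA's actual implementation: the hypersphere model of known concepts, the plain k-means used as a stand-in for the "efficient standard clustering algorithms", and the names and thresholds `max_mean_dist` and `min_size` are all assumptions made for illustration only. The concept-merging step is omitted.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic start (first k points as centers)."""
    centers = list(points[:k])
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        centers = [centroid(g) if g else centers[i] for i, g in enumerate(groups)]
    return [g for g in groups if g]

def is_explained(x, model):
    """Assumption: a known concept is a (center, radius) hypersphere."""
    return any(math.dist(x, c) <= r for c, r in model)

def is_valid(cluster, max_mean_dist, min_size):
    """Hypothetical validation: enough members (representativeness) and a
    small mean distance to the centroid (cohesiveness)."""
    c = centroid(cluster)
    mean_d = sum(math.dist(p, c) for p in cluster) / len(cluster)
    return len(cluster) >= min_size and mean_d <= max_mean_dist

def find_new_concepts(examples, model, k=2, max_mean_dist=1.0, min_size=3):
    """Cluster the unexplained examples and return validated clusters
    as new (center, radius) concepts."""
    unknown = [x for x in examples if not is_explained(x, model)]
    if len(unknown) < k:
        return []
    concepts = []
    for cluster in kmeans(unknown, k):
        if is_valid(cluster, max_mean_dist, min_size):
            c = centroid(cluster)
            concepts.append((c, max(math.dist(p, c) for p in cluster)))
    return concepts

# Toy data: one known concept around the origin, a dense unexplained group
# near (5, 5), and a single isolated outlier at (20, 20).
model = [((0.0, 0.0), 1.0)]
examples = [(0.1, 0.2), (5.0, 5.0), (5.2, 5.1), (4.9, 5.0),
            (5.1, 4.8), (20.0, 20.0), (0.5, -0.3)]
concepts = find_new_concepts(examples, model)
print(concepts)
```

On this toy input the dense group passes validation and becomes a new concept, while the lone point at (20, 20) forms a one-member cluster that fails the size check and is disregarded, mirroring the abstract's claim about unrepresentative clusters and outliers.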