Fast perceptron decision tree learning from evolving data streams

  • Authors:
  • Albert Bifet; Geoff Holmes; Bernhard Pfahringer; Eibe Frank

  • Affiliations:
  • University of Waikato, Hamilton, New Zealand; University of Waikato, Hamilton, New Zealand; University of Waikato, Hamilton, New Zealand; University of Waikato, Hamilton, New Zealand

  • Venue:
  • PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
  • Year:
  • 2010

Quantified Score

Hi-index 0.01

Abstract

Mining of data streams must balance three evaluation dimensions: accuracy, time and memory. Excellent accuracy on data streams has been obtained with Naive Bayes Hoeffding Trees (Hoeffding Trees with naive Bayes models at the leaf nodes), albeit with increased runtime compared to standard Hoeffding Trees. In this paper, we show that runtime can be reduced by replacing naive Bayes with perceptron classifiers, while maintaining highly competitive accuracy. We also show that accuracy can be increased even further by combining majority vote, naive Bayes, and perceptrons. We evaluate four perceptron-based learning strategies and compare them against appropriate baselines: simple perceptrons, Perceptron Hoeffding Trees, hybrid Naive Bayes Perceptron Trees, and bagged versions thereof. We implement a perceptron that uses the sigmoid activation function instead of the threshold activation function and optimizes the squared error, with one perceptron per class value. We test our methods by performing an evaluation study on synthetic and real-world datasets comprising up to ten million examples.
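
The perceptron described in the abstract (sigmoid activation, squared-error objective, one unit per class value) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `SigmoidPerceptron`, the `learn_one`/`predict` method names, the learning rate, the zero initialization, the bias term, and the toy stream are all assumptions made for illustration.

```python
import math
import random


class SigmoidPerceptron:
    """Illustrative online learner: one sigmoid unit per class value,
    each trained by stochastic gradient descent on the squared error."""

    def __init__(self, n_features, n_classes, learning_rate=0.5):
        self.lr = learning_rate  # assumed value, not taken from the paper
        # One weight vector per class; the last entry is an assumed bias weight.
        self.weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]

    @staticmethod
    def _sigmoid(z):
        # Clamp the pre-activation to avoid overflow in math.exp.
        z = max(-60.0, min(60.0, z))
        return 1.0 / (1.0 + math.exp(-z))

    def _outputs(self, x):
        xs = list(x) + [1.0]  # append constant input for the bias
        return [
            self._sigmoid(sum(w_j * x_j for w_j, x_j in zip(w, xs)))
            for w in self.weights
        ]

    def predict(self, x):
        # Predict the class whose unit produces the largest output.
        outs = self._outputs(x)
        return max(range(len(outs)), key=outs.__getitem__)

    def learn_one(self, x, y):
        xs = list(x) + [1.0]
        outs = self._outputs(x)
        for c, w in enumerate(self.weights):
            target = 1.0 if c == y else 0.0
            o = outs[c]
            # Negative gradient of 0.5 * (target - o)^2 with respect to the
            # pre-activation of a sigmoid unit: (target - o) * o * (1 - o).
            delta = (target - o) * o * (1.0 - o)
            for j, x_j in enumerate(xs):
                w[j] += self.lr * delta * x_j


# Toy stream (hypothetical data): the label is 1 when the first feature
# exceeds the second, so the classes are linearly separable.
rng = random.Random(0)
model = SigmoidPerceptron(n_features=2, n_classes=2)
for _ in range(5000):
    x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    model.learn_one(x, 1 if x[0] > x[1] else 0)
```

Each example is processed once and then discarded, which is what makes this kind of learner suitable for a stream setting; in a Perceptron Hoeffding Tree, a model of this kind would sit at each leaf in place of the naive Bayes model.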