Fast perceptron decision tree learning from evolving data streams

Authors:
Albert Bifet; Geoff Holmes; Bernhard Pfahringer; Eibe Frank
Affiliations:
University of Waikato, Hamilton, New Zealand; University of Waikato, Hamilton, New Zealand; University of Waikato, Hamilton, New Zealand; University of Waikato, Hamilton, New Zealand
Venue:
PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
Year:
2010

Abstract

Mining of data streams must balance three evaluation dimensions: accuracy, time, and memory. Excellent accuracy on data streams has been obtained with Naive Bayes Hoeffding Trees (Hoeffding Trees with naive Bayes models at the leaf nodes), albeit with increased runtime compared to standard Hoeffding Trees. In this paper, we show that runtime can be reduced by replacing naive Bayes with perceptron classifiers, while maintaining highly competitive accuracy. We also show that accuracy can be increased even further by combining majority vote, naive Bayes, and perceptrons. We evaluate four perceptron-based learning strategies and compare them against appropriate baselines: simple perceptrons, Perceptron Hoeffding Trees, hybrid Naive Bayes Perceptron Trees, and bagged versions thereof. We implement a perceptron that uses the sigmoid activation function instead of the threshold activation function and optimizes the squared error, with one perceptron per class value. We test our methods by performing an evaluation study on synthetic and real-world datasets comprising up to ten million examples.
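
The perceptron described in the abstract (sigmoid activation, squared-error objective, one perceptron per class value) might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the class name `SigmoidPerceptronClassifier`, the learning rate, the bias weight, the zero initialisation, and the synthetic two-feature stream in `demo` are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class SigmoidPerceptronClassifier:
    """One sigmoid unit per class value, trained online on squared error."""

    def __init__(self, n_features, n_classes, learning_rate=0.5):
        # Assumption: learning rate and zero initialisation are illustrative.
        self.lr = learning_rate
        # Assumption: one extra weight per class acts as a bias term.
        self.weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]

    def outputs(self, x):
        """Return the sigmoid output of every per-class perceptron."""
        xb = list(x) + [1.0]
        return [sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, xb)))
                for w in self.weights]

    def predict(self, x):
        """Predict the class whose perceptron gives the largest output."""
        out = self.outputs(x)
        return max(range(len(out)), key=out.__getitem__)

    def learn_one(self, x, label):
        """Single-example gradient step for each per-class perceptron."""
        xb = list(x) + [1.0]
        for k, (w, o) in enumerate(zip(self.weights, self.outputs(x))):
            target = 1.0 if k == label else 0.0
            # Gradient of 0.5 * (target - o)^2 with o = sigmoid(w . x).
            delta = self.lr * (target - o) * o * (1.0 - o)
            for i, x_i in enumerate(xb):
                w[i] += delta * x_i


def demo(n_examples=5000, seed=1):
    """Test-then-train loop on a hypothetical, linearly separable stream."""
    rng = random.Random(seed)
    model = SigmoidPerceptronClassifier(n_features=2, n_classes=2)
    correct = 0
    for _ in range(n_examples):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        label = 1 if x[0] + x[1] > 0 else 0
        correct += int(model.predict(x) == label)  # test first...
        model.learn_one(x, label)                  # ...then train
    return model, correct / n_examples
```

Calling `demo()` returns the trained model together with the fraction of examples predicted correctly before being learned from. Each example is seen once and then discarded, which is the property that makes a model of this kind usable at the leaves of a stream-based tree.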