Exploiting temporal contexts in text classification
Proceedings of the 17th ACM conference on Information and knowledge management
Temporally-aware algorithms for document classification
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
SEMCCO'11 Proceedings of the Second international conference on Swarm, Evolutionary, and Memetic Computing - Volume Part I
An adaptive boosting ensemble algorithm for classifying homogeneous distributed data streams is presented. The method builds an ensemble of classifiers by using Genetic Programming (GP) to inductively generate decision trees, each trained on a different part of the distributed training set. The approach adopts a co-evolutionary platform to support a cooperative model of GP. A change detection strategy, based on the self-similarity of the ensemble behavior as measured by its fractal dimension, makes it possible to capture time-evolving trends and patterns in the stream and to reveal changes in the evolving data. The approach tracks the deviation of online ensemble accuracy over time and recomputes the ensemble when the deviation exceeds a prespecified threshold. This allows an accurate and up-to-date ensemble of classifiers to be maintained for continuous flows of data with concept drift. Experimental results on a real-life data set show the validity of the approach.
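The threshold-triggered rebuild described in the abstract can be illustrated with a minimal Python sketch. It is a simplified stand-in, not the authors' method: it compares windowed accuracy against a reference value instead of computing a fractal dimension, and the class name `AccuracyDriftMonitor`, the `window_size` and `threshold` parameters, and their default values are all hypothetical.

```python
from collections import deque


class AccuracyDriftMonitor:
    """Suggests a rebuild when windowed accuracy falls too far below a reference.

    Assumption: a plain accuracy-drop test replaces the fractal-dimension
    measure named in the abstract; all names here are illustrative.
    """

    def __init__(self, window_size=50, threshold=0.15):
        self.window = deque(maxlen=window_size)  # recent hit (1) / miss (0) outcomes
        self.threshold = threshold               # tolerated drop in accuracy
        self.reference = None                    # accuracy of the first full window

    def update(self, predicted, actual):
        """Record one prediction; return True if a rebuild is suggested."""
        self.window.append(1 if predicted == actual else 0)
        if len(self.window) < self.window.maxlen:
            return False  # not enough evidence yet
        accuracy = sum(self.window) / len(self.window)
        if self.reference is None:
            self.reference = accuracy  # first full window sets the baseline
            return False
        if self.reference - accuracy > self.threshold:
            # Drift suspected: reset so a rebuilt ensemble gets a fresh baseline.
            self.window.clear()
            self.reference = None
            return True
        return False
```

In a streaming loop, the caller would pass each ensemble prediction and its true label to `update` and retrain the ensemble whenever it returns `True`. Resetting the window after a trigger is one possible design choice; it avoids judging the new ensemble by the old one's errors.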