Learning drifting concepts: Example selection vs. example weighting

  • Authors: Ralf Klinkenberg
  • Affiliation: University of Dortmund, Computer Science Department, Artificial Intelligence Unit (LS VIII), 44221 Dortmund, Germany. E-mail: Ralf.Klinkenberg@cs.uni-dortmund.de, URL: http://www-ai.cs.uni-dortmun ...
  • Venue: Intelligent Data Analysis
  • Year: 2004

Abstract

For many learning tasks where data is collected over an extended period of time, the underlying distribution is likely to change. A typical example is information filtering, i.e., the adaptive classification of documents with respect to a particular user interest: both the interest of the user and the document content change over time, and a filtering system should be able to adapt to such concept changes. This paper proposes several methods for handling such concept drift with support vector machines. The methods either maintain an adaptive time window on the training data [13], select representative training examples, or weight the training examples [15]. The key idea is to automatically adjust the window size, the example selection, or the example weighting, respectively, so that the estimated generalization error is minimized. The approaches are theoretically well-founded as well as effective and efficient in practice. Since they do not require complicated parameterization, they are simpler to use and more robust than comparable heuristics. Experiments with simulated concept drift scenarios based on real-world text data compare the new methods with other window management approaches. We show that they can effectively select an appropriate window size, example selection, or example weighting, respectively, in a robust way. We also explain how the proposed example selection and weighting approaches can be turned into incremental approaches. Since most evaluation methods for machine learning, such as cross-validation, assume that the examples are independent and identically distributed, an assumption clearly violated under concept drift, alternative evaluation schemes are used to estimate and optimize the performance of each learning step within the drift-handling frameworks, as well as to evaluate and compare the frameworks themselves.
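To make the window-adjustment and example-weighting ideas concrete, the sketch below trains an SVM on candidate windows of recent data batches and keeps the window whose model minimizes an estimated error; a second function weights training examples by recency. This is an illustration only, not the paper's procedure: the paper selects the window via SVM-specific ξα estimates of the leave-one-out error, whereas this sketch holds out the newest batch as a validation set, and the function names, candidate window sizes, and geometric decay schedule are assumptions made for this example.

```python
# Minimal sketch of adaptive-window training under concept drift.
# NOT the paper's exact procedure: the paper selects the window via
# SVM-specific xi/alpha estimates of the leave-one-out error, while
# this sketch holds out the newest batch as a validation set. The
# candidate window sizes and the decay schedule are assumptions.
import numpy as np
from sklearn.svm import SVC

def select_window(batches, candidate_sizes=(1, 2, 4, 8)):
    """Pick the window size (in batches) whose model has the lowest
    estimated error, training on the most recent `size` batches.

    batches: list of (X, y) arrays, oldest first; the newest batch
    serves here as the error-estimation set and is excluded from
    training.
    """
    X_val, y_val = batches[-1]
    history = batches[:-1]
    best = (np.inf, None, None)  # (error, window size, model)
    for size in candidate_sizes:
        if size > len(history):
            break
        window = history[-size:]  # keep only the newest `size` batches
        X = np.vstack([Xb for Xb, _ in window])
        y = np.concatenate([yb for _, yb in window])
        model = SVC(kernel="linear", C=1.0).fit(X, y)
        err = 1.0 - model.score(X_val, y_val)  # estimated error
        if err < best[0]:
            best = (err, size, model)
    return best

def train_weighted(batches, decay=0.9):
    """Example weighting: older batches receive geometrically smaller
    weights (the paper instead tunes the weighting by minimizing the
    estimated generalization error)."""
    n = len(batches)
    X = np.vstack([Xb for Xb, _ in batches])
    y = np.concatenate([yb for _, yb in batches])
    w = np.concatenate([np.full(len(yb), decay ** (n - 1 - i))
                        for i, (_, yb) in enumerate(batches)])
    return SVC(kernel="linear", C=1.0).fit(X, y, sample_weight=w)
```

The holdout-based error estimate used above is a stand-in; the paper's approach avoids sacrificing the newest examples for validation by estimating generalization error directly from the trained SVM, which matters when recent data is scarce.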