Mining data streams with changing class distributions is important for real-time business decision support. The stream classifier must evolve to reflect the current class distribution, which poses a serious challenge. On the one hand, relying on historical data may increase the chances of learning obsolete models. On the other hand, learning only from the latest data may lead to biased classifiers, because the latest data is often an unrepresentative sample of the current class distribution. The problem is particularly acute when classifying rare events: instances of the rare class may not appear at all in the most recent training data. In this paper, we use a stochastic model to describe concept-shifting patterns and formulate the problem as an optimization problem: given the historical and current training data observed so far, find the most likely current distribution and learn a classifier based on it. We derive an analytic solution and approximate it with an efficient algorithm that carefully calibrates the influence of historical data to produce an accurate classifier. We evaluate the algorithm on both synthetic and real-world datasets. The results show that it classifies accurately and efficiently.
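To make the trade-off between historical and recent data concrete, the following is a minimal sketch, not the algorithm derived in the paper. It assumes a simple exponential decay as a stand-in for calibrating the influence of older data when estimating class priors. The function name `decayed_class_priors`, the `decay` parameter, and the toy "normal"/"fraud" labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def decayed_class_priors(chunks, decay=0.5):
    """Estimate class priors from label chunks ordered oldest to newest.

    Illustrative assumption: a chunk that is `age` steps older than the
    newest chunk contributes with weight decay ** age. This is a plain
    heuristic, not the paper's most-likely-distribution estimate.
    """
    weighted = Counter()
    newest = len(chunks) - 1
    for index, labels in enumerate(chunks):
        weight = decay ** (newest - index)
        for label in labels:
            weighted[label] += weight
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no labels observed")
    return {label: w / total for label, w in weighted.items()}


# Toy stream: the rare class "fraud" is absent from the newest chunk.
chunks = [
    ["normal"] * 98 + ["fraud"] * 2,   # oldest
    ["normal"] * 97 + ["fraud"] * 3,
    ["normal"] * 100,                  # newest
]

latest_only = decayed_class_priors(chunks[-1:])
priors = decayed_class_priors(chunks, decay=0.5)

# Learning from the newest chunk alone loses the rare class entirely,
# while the decayed estimate still assigns it a small nonzero prior.
print("fraud" in latest_only)
print(priors["fraud"] > 0)
```

In this sketch a smaller `decay` trusts recent data more, and a value of 1.0 weights all history equally. The paper instead selects the influence of historical data by solving for the most likely current distribution under its stochastic model of concept shifts.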