Incremental learning techniques have been used extensively to address the data stream classification problem. The central issue is balancing accuracy and efficiency: the algorithm should classify well while responding in a reasonable time. This work introduces a new technique, the Similarity-based Data Stream Classifier (SimC), which achieves good performance through a novel insertion/removal policy that adapts quickly to the tendency of the data and maintains a small, representative set of examples and estimators that guarantees good classification rates. The methodology can also detect novel classes/labels during the running phase and remove useless ones that add no value to the classification process. Statistical tests were used to evaluate model performance from two points of view: efficacy (classification rate) and efficiency (online response time). Five well-known techniques were compared on sixteen data streams using Friedman's test. To find out which schemes were significantly different, Nemenyi's, Holm's, and Shaffer's tests were also considered. The results show that SimC is very competitive in terms of (absolute and streaming) accuracy and classification/updating time in comparison with several of the most popular methods in the literature.
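The statistical comparison mentioned above can be illustrated with a minimal sketch of Friedman's test statistic and the Nemenyi critical difference. This is the textbook formulation, not the authors' evaluation code. The function names, the `toy_accuracy` table, and its numbers are hypothetical and serve only to show the mechanics; the critical value `q_alpha` is assumed to be looked up by the caller in a studentized-range table. Holm's and Shaffer's procedures are not covered.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks_row(scores: Sequence[float]) -> List[float]:
    """Rank one data stream's scores (rank 1 = best); ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(table: Sequence[Sequence[float]]) -> Tuple[List[float], float]:
    """Return (mean rank per classifier, Friedman chi-square statistic).

    `table[i][j]` is the score of classifier j on data stream i.
    Uses the basic formula without a correction for ties.
    """
    n = len(table)       # number of data streams
    k = len(table[0])    # number of classifiers
    rank_rows = [average_ranks_row(row) for row in table]
    mean_ranks = [sum(row[j] for row in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return mean_ranks, chi2


def nemenyi_critical_difference(q_alpha: float, k: int, n: int) -> float:
    """Two classifiers differ significantly if their mean ranks differ by more than this."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n))


# Hypothetical toy numbers (3 streams x 3 classifiers), NOT results from the paper.
toy_accuracy = [
    [0.90, 0.85, 0.80],
    [0.88, 0.86, 0.70],
    [0.75, 0.80, 0.78],
]
mean_ranks, chi2 = friedman_statistic(toy_accuracy)
```

With the statistic in hand, one would compare it against a chi-square distribution with k - 1 degrees of freedom, and only run the post-hoc comparison if the null hypothesis of equal performance is rejected.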