Info-fuzzy algorithms for mining dynamic data streams

Authors:
Lior Cohen;Gil Avrahami;Mark Last;Abraham Kandel
Affiliations:
Ben-Gurion University of the Negev, Department of Information Systems Engineering, Beer-Sheva 84105, Israel;Ben-Gurion University of the Negev, Department of Information Systems Engineering, Beer-Sheva 84105, Israel;Ben-Gurion University of the Negev, Department of Information Systems Engineering, Beer-Sheva 84105, Israel;Department of Computer Science and Engineering, University of South Florida, Tampa, FL 33620, USA
Venue:
Applied Soft Computing
Year:
2008

Abstract

Most data-mining algorithms assume that incoming data behaves statically. In the real world, however, most continuously collected data streams are generated by dynamic processes that may change over time, in some cases drastically. A change in the underlying concept, known as concept drift, causes the data-mining model generated from past examples to become less accurate and less relevant for classifying current data. Most online learning algorithms deal with concept drift by generating a new model every time a drift is detected. On one hand, this solution keeps the model accurate and relevant at all times, which increases classification accuracy. On the other hand, it suffers from a major drawback: the high computational cost of generating new models. The problem worsens as concept drift is detected more frequently; hence, a compromise between computational effort and accuracy is needed. This work describes a series of incremental algorithms that are shown empirically to produce more accurate classification models than batch algorithms in the presence of concept drift, while being computationally cheaper than existing incremental methods. The proposed incremental algorithms are based on an advanced decision-tree learning methodology called "Info-Fuzzy Network" (IFN), which is capable of inducing compact and accurate classification models. The algorithms are evaluated on real-world streams of traffic and intrusion-detection data.
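
The regenerate-on-drift strategy mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the IFN algorithm or any method from the paper: the majority-class "model", the error-gap drift test, and the names `train_majority`, `error_rate`, `process_stream`, and `drift_threshold` are all assumptions chosen only to show why rebuilding a model solely when the error rate jumps saves work compared with rebuilding on every window.

```python
from collections import Counter


def train_majority(window):
    """Toy stand-in for model induction: remember the most frequent label."""
    labels = [label for _, label in window]
    return Counter(labels).most_common(1)[0][0]


def error_rate(model, window):
    """Fraction of examples in the window whose label differs from the model's prediction."""
    return sum(1 for _, label in window if label != model) / len(window)


def process_stream(windows, drift_threshold=0.2):
    """Classify each window with the current model; rebuild only on suspected drift.

    Drift is flagged (hypothetical rule) when the error on the new window
    exceeds the error measured at training time by more than drift_threshold.
    Returns the number of rebuilds and the per-window error rates.
    """
    model = train_majority(windows[0])
    train_err = error_rate(model, windows[0])
    rebuilds = 0
    errors = []
    for window in windows[1:]:
        err = error_rate(model, window)
        errors.append(err)
        if err - train_err > drift_threshold:
            # Suspected concept drift: pay the cost of a new model.
            model = train_majority(window)
            train_err = error_rate(model, window)
            rebuilds += 1
    return rebuilds, errors


if __name__ == "__main__":
    stream = [
        [(0, "a")] * 10,               # initial concept: label "a"
        [(0, "a")] * 9 + [(0, "b")],   # mild noise, no rebuild expected
        [(0, "b")] * 10,               # abrupt drift to label "b"
        [(0, "b")] * 10,               # stable under the rebuilt model
    ]
    print(process_stream(stream))
```

In this toy stream the model is rebuilt once, at the abrupt change, rather than on each of the three incoming windows. A real system would replace the majority-class model with an actual learner and the error-gap rule with a statistically grounded drift test.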