In this paper we propose a new method for incremental discretization. The basic idea is to perform the task in two layers. The first layer receives the sequence of input data and maintains statistics on the data using many more intervals than required. The second layer builds the final discretization from the statistics stored by the first layer. The proposed architecture processes streaming examples in a single scan, in constant time and space, even for infinite sequences of examples. We demonstrate experimentally that incremental discretization maintains the performance of learning algorithms relative to batch discretization. The proposed method is better suited to incremental learning and to problems where data flow continuously, as in most recent data mining applications.
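The two-layer idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class and method names are hypothetical, the first layer is assumed to use fixed equal-width bins over a value range known in advance, and the second layer is assumed to build an equal-frequency discretization, which is only one of several possible choices for the final step.

```python
from bisect import bisect_right


class TwoLayerDiscretizer:
    """Illustrative two-layer discretizer (hypothetical, not the paper's code)."""

    def __init__(self, low, high, n_fine_bins=100):
        # Assumption: the value range [low, high) is known up front.
        self.low = low
        self.n_fine_bins = n_fine_bins
        self.width = (high - low) / n_fine_bins
        self.counts = [0] * n_fine_bins  # layer-1 statistics, fixed size
        self.total = 0

    def update(self, x):
        """Layer 1: constant-time count update for one streaming example."""
        idx = int((x - self.low) / self.width)
        # Out-of-range values are clamped to the first or last fine bin.
        idx = min(max(idx, 0), self.n_fine_bins - 1)
        self.counts[idx] += 1
        self.total += 1

    def cut_points(self, k):
        """Layer 2: merge fine bins into at most k equal-frequency intervals."""
        if self.total == 0:
            return []
        cuts = []
        cumulative = 0
        target = 1  # next quantile index (1 .. k-1) still to be placed
        # The right edge of the last fine bin is never a useful cut point.
        for i in range(self.n_fine_bins - 1):
            cumulative += self.counts[i]
            if target < k and cumulative * k >= target * self.total:
                cuts.append(self.low + (i + 1) * self.width)
                # Skip quantiles already covered by this same fine bin.
                while target < k and cumulative * k >= target * self.total:
                    target += 1
        return cuts

    def discretize(self, x, k):
        """Map a value to the index of its final interval."""
        return bisect_right(self.cut_points(k), x)
```

In this sketch the first layer uses a fixed number of counters, so memory does not grow with the length of the stream, and the second layer can be recomputed on demand from those counters without revisiting the data. The number of final intervals `k` is supplied only when the discretization is requested, so it can be changed without rescanning the stream.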