Density estimation is a costly operation for computing the distribution information of data sets, and it underlies many important data mining applications, such as clustering and biased sampling. However, traditional density estimation methods are inapplicable to streaming data, which arrive continuously and in large volumes, because they require linear storage and quadratic computation. This shortcoming limits the application of many existing effective algorithms to data streams, for which mining is both an urgent need for applications and a challenge for research. In this paper, the problem of computing density functions over data streams is examined. A novel method that addresses this shortcoming is developed to enable density estimation for large volumes of data in linear time and fixed-size memory, without loss of accuracy. The method is based on M-Kernel merging, so that the limited number of kernel functions to be maintained is determined intelligently. The application of the new method to different streaming data models is discussed, and the results of intensive experiments are presented. The analytical and empirical results show that this new density estimation algorithm for data streams can calculate density functions on demand at any time with high accuracy for different streaming data models.
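
The idea of keeping a bounded set of kernels by merging can be illustrated with a minimal sketch. This is not the paper's M-Kernel algorithm, whose merging criterion is not given in the abstract. It assumes one-dimensional data and Gaussian kernels, and it swaps in a simple rule: when the kernel budget is exceeded, the two kernels with the closest means are replaced by one moment-matched kernel. The class name `MergingKDE` and the parameters `max_kernels` and `bandwidth` are hypothetical.

```python
import bisect
import math
import random


class MergingKDE:
    """Fixed-memory 1-D density sketch (illustrative, not the paper's M-Kernel)."""

    def __init__(self, max_kernels=50, bandwidth=0.5):
        self.max_kernels = max_kernels   # assumed memory budget
        self.bandwidth = bandwidth       # assumed initial kernel width
        self.kernels = []                # sorted list of [mean, variance, weight]
        self.count = 0                   # number of points seen so far

    def update(self, x):
        # Each arriving point starts as its own Gaussian kernel of weight 1.
        bisect.insort(self.kernels, [float(x), self.bandwidth ** 2, 1.0])
        self.count += 1
        if len(self.kernels) > self.max_kernels:
            self._merge_closest()

    def _merge_closest(self):
        # Assumed merge rule: pick the adjacent pair with the smallest gap in means.
        i = min(range(len(self.kernels) - 1),
                key=lambda j: self.kernels[j + 1][0] - self.kernels[j][0])
        (m1, v1, w1), (m2, v2, w2) = self.kernels[i], self.kernels[i + 1]
        w = w1 + w2
        # Moment matching: preserve total weight, mean, and variance of the pair.
        m = (w1 * m1 + w2 * m2) / w
        v = (w1 * v1 + w2 * v2) / w + w1 * w2 * (m1 - m2) ** 2 / (w * w)
        # The merged mean lies between m1 and m2, so the list stays sorted.
        self.kernels[i:i + 2] = [[m, v, w]]

    def density(self, x):
        # Weighted Gaussian mixture, normalised by the number of points seen.
        if self.count == 0:
            return 0.0
        total = 0.0
        for m, v, w in self.kernels:
            total += w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        return total / self.count


# Usage: stream 10,000 synthetic points through a 50-kernel budget.
random.seed(0)
est = MergingKDE(max_kernels=50, bandwidth=0.3)
for _ in range(10_000):
    est.update(random.gauss(0.0, 1.0))
```

In this sketch memory stays bounded by `max_kernels` and the density can be queried at any point in the stream. Each update scans the kernel list, so the cost per point is proportional to the budget rather than to the stream length. Unlike the method described above, this simplified rule makes no claim about preserving accuracy.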