Indexing density models for incremental learning and anytime classification on data streams

Authors:
Thomas Seidl;Ira Assent;Philipp Kranen;Ralph Krieger;Jennifer Herrmann
Affiliations:
RWTH Aachen University, Germany;Aalborg University, Denmark;RWTH Aachen University, Germany;RWTH Aachen University, Germany;RWTH Aachen University, Germany
Venue:
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Year:
2009

Abstract

Classification of streaming data faces three basic challenges: it has to deal with huge amounts of data; the varying time between two stream data items must be used as effectively as possible (anytime classification); and additional training data must be learned incrementally (anytime learning) so that the classifier can be applied consistently to fast data streams. In this work, we propose a novel index-based technique that handles all three challenges using the established Bayes classifier on effective kernel density estimators. Our novel Bayes tree automatically generates a hierarchy of mixture densities that represent kernel density estimators at successively coarser levels, and it adapts this hierarchy efficiently to the individual object to be classified. Our probability density queries, together with novel classification improvement strategies, provide the necessary information for very effective classification at any point of interruption. Moreover, we propose a novel evaluation method for anytime classification using Poisson streams and demonstrate the anytime learning performance of the Bayes tree.
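
To make the idea of anytime Bayes classification over a hierarchy of mixture densities more concrete, the following is a minimal sketch and not the Bayes tree from the paper. It assumes one-dimensional data, Gaussian kernels with a fixed bandwidth `h`, a balanced binary hierarchy built by splitting sorted points, and a round-robin refinement order across classes. The names `Node`, `build`, and `anytime_classify` are hypothetical, and the paper's index structure, descent strategies, and improvement strategies may differ.

```python
import math
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    n: int        # number of training points summarised by this node
    mean: float   # mean of those points
    var: float    # variance of the Gaussian that summarises them
    children: list = field(default_factory=list)

    def joint(self, x, total):
        """Contribution of this node to prior * class-conditional density."""
        g = math.exp(-((x - self.mean) ** 2) / (2 * self.var))
        return (self.n / total) * g / math.sqrt(2 * math.pi * self.var)


def build(points, h):
    """Summarise sorted 1-D points; leaves are single kernels of variance h^2."""
    n = len(points)
    mean = sum(points) / n
    spread = sum((p - mean) ** 2 for p in points) / n
    # Moment-matched Gaussian for an equal-weight mixture of n kernels.
    node = Node(n, mean, spread + h * h)
    if n > 1:
        mid = n // 2
        node.children = [build(points[:mid], h), build(points[mid:], h)]
    return node


def anytime_classify(x, roots, budget):
    """Yield the current best label first, then again after each refinement."""
    total = sum(r.n for r in roots.values())
    frontiers = {c: [r] for c, r in roots.items()}  # coarsest model per class
    labels = list(roots)

    def best():
        score = {c: sum(nd.joint(x, total) for nd in f)
                 for c, f in frontiers.items()}
        return max(score, key=score.get)

    yield best()
    for step in range(budget):
        # Assumption: classes take turns being refined (round robin).
        frontier = frontiers[labels[step % len(labels)]]
        inner = [nd for nd in frontier if nd.children]
        if not inner:
            continue  # this class is already at full kernel resolution
        # Refine the node that contributes most to the density at x.
        target = max(inner, key=lambda nd: nd.joint(x, total))
        frontier.remove(target)
        frontier.extend(target.children)
        yield best()


training = {"a": [-0.5, -0.2, 0.0, 0.1, 0.4], "b": [4.0, 4.6, 5.0, 5.3, 6.0]}
roots = {c: build(sorted(pts), h=0.5) for c, pts in training.items()}
for label in anytime_classify(0.2, roots, budget=8):
    print(label)  # the caller may stop consuming at any point
```

Because the classifier is written as a generator, interruption amounts to no longer requesting the next value; the last label received is the answer. Once every frontier consists only of leaves, the score equals the full kernel density estimate under the stated assumptions.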