Prototype-based mining of numeric data streams

Authors:
Francisco Ferrer-Troyano;Jesús S. Aguilar-Ruiz;José C. Riquelme
Affiliations:
University of Seville, Av. Reina Mercedes S/N, Seville, Spain;University of Seville, Av. Reina Mercedes S/N, Seville, Spain;University of Seville, Av. Reina Mercedes S/N, Seville, Spain
Venue:
Proceedings of the 2003 ACM symposium on Applied computing
Year:
2003

Abstract

Large organizations collect open-ended, time-changing data that arrive at high speed. Extracting useful knowledge from these potentially infinite databases is a new challenge in data mining. In this paper we propose an anytime incremental learning algorithm for mining numeric data streams. Set within supervised learning, our approach is based on prototypes and hypercubic decision rules, with simplicity of the resulting model and low time complexity as primary goals. Experimental results with synthetic databases of 100 gigabytes show good performance on data streams that change continuously.
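
The abstract does not describe the algorithm's internals, so the following is only a minimal illustrative sketch of what an incremental learner built on axis-parallel hypercubic rules might look like, not the authors' method. Every name here (HypercubeRule, IncrementalHypercubeLearner, max_growth) and the update policy (extend the nearest same-class rule when the required growth is small, otherwise start a new rule) are assumptions made for illustration. The sketch also ignores overlap between rules of different classes, which a real system would need to handle.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class HypercubeRule:
    """Hypothetical axis-parallel hypercube labelled with a single class."""
    lower: List[float]
    upper: List[float]
    label: str
    support: int = 1  # number of examples absorbed by this rule

    def covers(self, x: Sequence[float]) -> bool:
        return all(lo <= v <= hi for lo, v, hi in zip(self.lower, x, self.upper))

    def growth(self, x: Sequence[float]) -> float:
        # Total side-length increase needed for the hypercube to cover x.
        return sum(max(lo - v, 0.0) + max(v - hi, 0.0)
                   for lo, v, hi in zip(self.lower, x, self.upper))

    def extend(self, x: Sequence[float]) -> None:
        self.lower = [min(lo, v) for lo, v in zip(self.lower, x)]
        self.upper = [max(hi, v) for hi, v in zip(self.upper, x)]
        self.support += 1


class IncrementalHypercubeLearner:
    """Assumed one-pass learner: each example is seen once, then discarded."""

    def __init__(self, max_growth: float) -> None:
        self.max_growth = max_growth  # assumed threshold, not from the paper
        self.rules: List[HypercubeRule] = []

    def learn_one(self, x: Sequence[float], label: str) -> None:
        same_class = [r for r in self.rules if r.label == label]
        if same_class:
            nearest = min(same_class, key=lambda r: r.growth(x))
            if nearest.growth(x) <= self.max_growth:
                nearest.extend(x)  # growth 0 means x was already covered
                return
        # No same-class rule is close enough: start a new point-sized rule.
        self.rules.append(HypercubeRule(list(x), list(x), label))

    def predict_one(self, x: Sequence[float]) -> Optional[str]:
        # The model is usable at any time, even mid-stream ("anytime").
        if not self.rules:
            return None
        covering = [r for r in self.rules if r.covers(x)]
        if covering:
            return max(covering, key=lambda r: r.support).label
        return min(self.rules, key=lambda r: r.growth(x)).label


learner = IncrementalHypercubeLearner(max_growth=1.5)
stream = [([0.0, 0.0], "a"), ([0.5, 0.5], "a"),
          ([5.0, 5.0], "b"), ([5.2, 4.9], "b")]
for features, label in stream:
    learner.learn_one(features, label)
```

Because each example updates at most one rule and is then dropped, memory depends on the number of rules rather than on the length of the stream, which is the property that makes this style of model attractive for unbounded data.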