This paper introduces Debellor (www.debellor.org), an open source, extensible data mining platform with a stream-based architecture, in which all data transfers between elementary algorithms take the form of a stream of samples. Data streaming enables the implementation of scalable algorithms that can efficiently process volumes of data exceeding available memory. This is important for data mining research and applications, since the most challenging data mining tasks involve voluminous data, either produced by a data source or generated at some intermediate stage of a complex data processing network. The advantages of data streaming are illustrated by experiments with clustering time series. The experimental results show that even for moderate-size data sets streaming is indispensable for successful execution of algorithms; without it, the algorithms run hundreds of times slower or crash due to memory shortage. Stream architecture is particularly useful in application domains such as time series analysis, image recognition, and mining data streams. It is also the only efficient architecture for implementing online algorithms. The algorithms currently available on the Debellor platform include all classifiers from the Rseslib and Weka libraries and all filters from Weka.
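The idea of passing samples between elementary algorithms as a stream can be illustrated with a minimal sketch. It is written in Python with generators and is not Debellor's API; the names `source`, `scale`, `RunningMean`, and `run` are hypothetical and chosen only for illustration. Each stage pulls one sample at a time from the previous stage, so memory use stays constant regardless of how many samples flow through, and the final stage is an online algorithm that updates its state per sample.

```python
from typing import Iterable, Iterator, List

Sample = List[float]


def source(n: int) -> Iterator[Sample]:
    """Hypothetical data source: yields samples one by one, never a full list."""
    for i in range(n):
        yield [float(i), float(2 * i)]


def scale(samples: Iterable[Sample], factor: float) -> Iterator[Sample]:
    """Hypothetical filter stage: transforms each sample as it passes through."""
    for s in samples:
        yield [x * factor for x in s]


class RunningMean:
    """Online algorithm: stores only a count and the per-feature means."""

    def __init__(self) -> None:
        self.count = 0
        self.mean: List[float] = []

    def update(self, sample: Sample) -> None:
        if not self.mean:
            self.mean = [0.0] * len(sample)
        self.count += 1
        # Incremental mean update; no past samples are retained.
        for j, x in enumerate(sample):
            self.mean[j] += (x - self.mean[j]) / self.count


def run(n: int, factor: float) -> RunningMean:
    """Connect source -> filter -> online learner into one streaming pipeline."""
    model = RunningMean()
    for sample in scale(source(n), factor):
        model.update(sample)
    return model
```

Calling `run(10_000_000, 0.5)` under these assumptions would process ten million samples while holding only one sample and the running statistics in memory at any time, which is the property the abstract attributes to stream-based processing.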