Considerable effort has gone into applying data mining algorithms to large databases, yet long processing times remain a practical issue. This paper presents a framework that offers database users online operators for mining large databases, with no limit on size and with acceptable processing times. First, we integrate decision tree algorithms directly into the database management system, so that we are limited only by disk capacity and not by main memory. Disk accesses, however, still induce long response times. In a second step, we therefore propose two optimisations: reducing the size of the learning database by building its corresponding contingency table, and reducing the number of database accesses by exploiting bitmap indices. The decision tree methods we implemented within Oracle thus operate on contingency tables or bitmap indices rather than on the whole training set. Our experiments show the efficiency of these integrated methods.
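The two optimisations can be illustrated with a small in-memory sketch. This is not the Oracle implementation described above. It only shows the idea, assuming a toy training set (`ROWS`) and hypothetical helper names (`contingency_table`, `info_gain`, `bitmap_index`, `bitmap_count`). The contingency table plays the role of a `GROUP BY` over all attributes with a count, and the bitmaps answer counting queries with a bitwise AND instead of a scan of the training rows.

```python
from collections import Counter
from math import log2

# Hypothetical toy training set: (outlook, windy, play); the last column is the class.
ROWS = [
    ("sunny", "no", "no"),
    ("sunny", "yes", "no"),
    ("overcast", "no", "yes"),
    ("rain", "no", "yes"),
    ("rain", "yes", "no"),
    ("overcast", "yes", "yes"),
    ("sunny", "no", "no"),
    ("rain", "no", "yes"),
]


def contingency_table(rows):
    """Collapse identical rows into (row -> frequency), like GROUP BY with COUNT(*)."""
    return Counter(rows)


def entropy(counts):
    """Shannon entropy (in bits) of a collection of class frequencies."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)


def info_gain(table, attr_idx, class_idx=-1):
    """Information gain of one attribute, computed from aggregated counts only."""
    class_counts = Counter()
    by_value = {}
    for row, n in table.items():
        class_counts[row[class_idx]] += n
        by_value.setdefault(row[attr_idx], Counter())[row[class_idx]] += n
    total = sum(class_counts.values())
    remainder = sum(
        sum(c.values()) / total * entropy(c.values()) for c in by_value.values()
    )
    return entropy(class_counts.values()) - remainder


def bitmap_index(rows, col):
    """One integer bitmap per distinct value; bit i is set when row i holds that value."""
    bitmaps = {}
    for i, row in enumerate(rows):
        bitmaps[row[col]] = bitmaps.get(row[col], 0) | (1 << i)
    return bitmaps


def bitmap_count(*bitmaps):
    """Count the rows matching every bitmap: bitwise AND, then count the set bits."""
    acc = bitmaps[0]
    for b in bitmaps[1:]:
        acc &= b
    return bin(acc).count("1")


table = contingency_table(ROWS)
best_attr = max((0, 1), key=lambda a: info_gain(table, a))

outlook_bm = bitmap_index(ROWS, 0)
class_bm = bitmap_index(ROWS, 2)
sunny_and_no = bitmap_count(outlook_bm["sunny"], class_bm["no"])
```

In this sketch the split criterion never touches `ROWS` once the contingency table exists, and `sunny_and_no` is obtained from two bitmaps alone. Information gain is used here only as a familiar example of a split measure; the abstract does not state which criteria the implemented methods use.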