A methodology is described for embedding predictive modeling algorithms in a commercial parallel database, specifically the parallel editions of IBM DB2 Universal Database, although many aspects of the overall approach can be used with other commercial parallel databases. This parallelization approach was implemented in the Version 8.2 release of DB2 Intelligent Miner Modeling to support a new predictive modeling algorithm called Transform Regression. This database-embedded mining algorithm provides all the usual benefits of database-embedded mining, including easier integration into large enterprise applications, the ability to perform entire data mining workflows directly from an SQL-based programming interface, reduced data transfer costs between the database and the data mining application, and faster, parallel data access during query processing. In addition to these benefits, a significant part of the data mining computation is itself parallelized without the use of any sophisticated parallel programming constructs or any specialized message-passing and parallel synchronization libraries.
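One way to picture how a modeling computation can be parallelized without message passing is the partial-aggregate pattern that parallel databases already apply to aggregate functions: each data partition is scanned independently into a small summary, and the summaries are then merged. The Python sketch below is only an illustration of that general pattern under stated assumptions. It fits an ordinary least-squares line rather than Transform Regression, it is not the DB2 Intelligent Miner implementation, and the names `Stats`, `scan_partition`, `merge_all`, and `fit_line` are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, List, Tuple

Row = Tuple[float, float]  # (x, y) pairs; a hypothetical two-column table


@dataclass(frozen=True)
class Stats:
    """Sufficient statistics for a least-squares line over one set of rows."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def merge(self, other: "Stats") -> "Stats":
        # Merging is plain addition, so partitions can be combined in any order.
        return Stats(self.n + other.n, self.sx + other.sx, self.sy + other.sy,
                     self.sxx + other.sxx, self.sxy + other.sxy)


def scan_partition(rows: Iterable[Row]) -> Stats:
    """Scan one partition with no knowledge of any other partition."""
    n, sx, sy, sxx, sxy = 0, 0.0, 0.0, 0.0, 0.0
    for x, y in rows:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    return Stats(n, sx, sy, sxx, sxy)


def merge_all(partials: List[Stats]) -> Stats:
    """Combine the per-partition summaries into one global summary."""
    return reduce(Stats.merge, partials, Stats())


def fit_line(stats: Stats) -> Tuple[float, float]:
    """Return (slope, intercept) from the merged statistics."""
    denom = stats.n * stats.sxx - stats.sx ** 2
    if denom == 0:
        raise ValueError("x values are constant or there are no rows")
    slope = (stats.n * stats.sxy - stats.sx * stats.sy) / denom
    intercept = (stats.sy - slope * stats.sx) / stats.n
    return slope, intercept


# Three made-up partitions of rows that lie on y = 2x + 1.
partitions: List[List[Row]] = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
]

# Each scan could run on a different database node; only the small
# Stats summaries would need to be brought together afterwards.
total = merge_all([scan_partition(p) for p in partitions])
slope, intercept = fit_line(total)
```

Because `merge` is associative and commutative, the per-partition scans need no coordination with one another, which is the property that lets a database engine schedule them in parallel on its own.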