Massively parallel in-database predictions using PMML

Authors:
Kaushik K. Das;Eugene Fratkin;Aleksander Gorajek;Konstantinos Stathatos;Maulin Gajjar
Affiliations:
EMC/Greenplum, San Mateo, CA, USA;EMC/Greenplum, San Mateo, CA, USA;EMC/Greenplum, San Mateo, CA, USA;Zementis, Inc., San Diego, CA, USA;Zementis, Inc., San Diego, CA, USA
Venue:
Proceedings of the 2011 workshop on Predictive markup language modeling
Year:
2011

Abstract

Like all open standards, the Predictive Model Markup Language (PMML) enables interoperability and portability in the world of data mining and predictive analytics. This means that models developed in any environment and tool set can be deployed and used in a completely different system. Such a level of flexibility creates new opportunities for addressing exceedingly demanding business agility and performance requirements. One of these requirements is the urgent need to apply the power of predictive analytics to derive reliable predictions and, hence, business decisions from vast amounts of data collected by many organizations. In this paper, we discuss how PMML enables embedding advanced predictive models directly into the database or the data warehouse, alongside the actual data to be scored. More importantly, we show how we can easily take advantage of highly parallel database architectures to efficiently derive predictions from very large volumes of data.