Mapping data mining algorithms on a GPU architecture: a study

Authors:
Ana Gainaru; Emil Slusanschi; Stefan Trausan-Matu
Affiliations:
University Politehnica of Bucharest, Romania and University of Illinois at Urbana-Champaign; University Politehnica of Bucharest, Romania; University Politehnica of Bucharest, Romania
Venue:
ISMIS'11 Proceedings of the 19th international conference on Foundations of intelligent systems
Year:
2011

Abstract

Data mining algorithms are designed to extract information automatically from huge amounts of data. The datasets analysed with these techniques are gathered from a variety of domains, from business-related fields to HPC and supercomputers. Because datasets continue to grow at an exponential rate, research has focused on parallelizing different data mining techniques. Recently, hybrid GPU architectures have started to be used for this task. However, the data transfer rate between CPU and GPU is a bottleneck for applications that deal with large data entries exhibiting numerous dependencies. In this paper we analyse how efficiently data mining algorithms can be mapped onto these architectures by extracting the common characteristics of these methods and by examining the communication patterns between main memory and the GPU's shared memory. We present an experimental study of the performance of memory systems on GPU architectures when running data mining algorithms, and we advance performance model guidelines based on our observations.