Accelerating text mining workloads in a MapReduce-based distributed GPU environment

Authors:
Peter Wittek; Sándor Darányi
Affiliations:
-;-
Venue:
Journal of Parallel and Distributed Computing
Year:
2013

Abstract

Scientific computing has used GPU-enabled machines successfully, often relying on distributed nodes to overcome the limits of device memory. Only a handful of text mining applications, however, benefit from such infrastructure. Since the initial steps of text mining are typically data intensive, and ease of deployment is an important factor in developing advanced applications, we introduce a flexible, distributed, MapReduce-based text mining workflow. It performs I/O-bound operations on CPUs with industry-standard tools and then runs compute-bound operations on GPUs, where the computations are optimized to ensure coalesced memory access and effective use of shared memory. We have performed extensive tests of our algorithms on a cluster of eight nodes, each with two NVIDIA Tesla M2050 GPUs attached, and we achieve considerable speedups for random projection and self-organizing maps.
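To make the map/reduce split for random projection concrete, the following is a minimal CPU-only sketch in plain Python. It is not the authors' implementation: the function names (`make_projection`, `map_project`, `reduce_collect`), the sign-based random matrix, and the toy documents are all assumptions chosen for illustration. In the workflow the abstract describes, the inner multiply-accumulate loop is the kind of compute-bound step that would run on a GPU.

```python
import random
from functools import reduce


def make_projection(n_terms, n_dims, seed=0):
    """Hypothetical helper: random matrix with entries +/- 1/sqrt(n_dims)."""
    rng = random.Random(seed)
    scale = 1.0 / n_dims ** 0.5
    return [[rng.choice((-scale, scale)) for _ in range(n_dims)]
            for _ in range(n_terms)]


def map_project(chunk, projection):
    """Map step: project every sparse document in one chunk.

    Each document is (doc_id, {term_id: weight}). The nested loop below is
    the compute-bound part that a GPU kernel would replace.
    """
    n_dims = len(projection[0])
    out = []
    for doc_id, sparse_vec in chunk:
        dense = [0.0] * n_dims
        for term_id, weight in sparse_vec.items():
            row = projection[term_id]
            for j in range(n_dims):
                dense[j] += weight * row[j]
        out.append((doc_id, dense))
    return out


def reduce_collect(acc, part):
    """Reduce step: merge the per-chunk results into one dictionary."""
    acc.update(part)
    return acc


# Toy corpus: 4 documents over a 4-term vocabulary, split into 2 chunks
# as if each chunk were handled by a different node.
docs = [
    ("d0", {0: 1.0, 3: 2.0}),
    ("d1", {1: 1.0}),
    ("d2", {2: 0.5, 3: 0.5}),
    ("d3", {0: 3.0}),
]
chunks = [docs[0:2], docs[2:4]]

R = make_projection(n_terms=4, n_dims=2, seed=42)
mapped = [map_project(chunk, R) for chunk in chunks]
reduced = reduce(reduce_collect, mapped, {})
```

Here `reduced` maps each document identifier to its two-dimensional projection. Because the chunks are independent, the map step parallelizes across nodes without communication, and only the projection matrix has to be shared.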