A bridging model for parallel computation

Authors: Leslie G. Valiant
Affiliations: Harvard Univ., Cambridge, MA
Venue: Communications of the ACM
Year: 1990

References: 18
Cited by: 573

Efficient Schemes for Parallel Communication

Journal of the ACM (JACM)
Randomized and deterministic simulations of PRAMs by parallel machines with restricted granularity of parallel memories

Acta Informatica
Routing, merging, and sorting on parallel models of computation

Journal of Computer and System Sciences
Type architectures, shared memory, and the corollary of modest potential

Annual review of computer science: vol. 1, 1986
Parallel algorithmic techniques for combinatorial computation

Annual review of computer science: vol. 3, 1988
Tight bounds on the complexity of parallel sorting

IEEE Transactions on Computers
Towards an architecture-independent analysis of parallel algorithms

STOC '88 Proceedings of the twentieth annual ACM symposium on Theory of computing
Portable programming within a message-passing model: the FFT as an example

C3P Proceedings of the third conference on Hypercube concurrent computers and applications - Volume 2
Optimal and sublogarithmic time randomized parallel sorting algorithms

SIAM Journal on Computing
A more practical PRAM model

SPAA '89 Proceedings of the first annual ACM symposium on Parallel algorithms and architectures
Communication complexity of PRAMs

Theoretical Computer Science - Special issue: Fifteenth international colloquium on automata, languages and programming, Tampere, Finland, July 1988
A complexity theory of efficient parallel algorithms

Theoretical Computer Science - Special issue: Fifteenth international colloquium on automata, languages and programming, Tampere, Finland, July 1988
From on-line to batch learning

COLT '89 Proceedings of the second annual workshop on Computational learning theory
Parallel algorithms for shared-memory machines

Handbook of theoretical computer science (vol. A)
General purpose parallel architectures

Handbook of theoretical computer science (vol. A)
Parallel Prefix Computation

Journal of the ACM (JACM)
Parallel hashing: an efficient implementation of shared memory

Journal of the ACM (JACM)
Ultracomputers

ACM Transactions on Programming Languages and Systems (TOPLAS)

Architecture-Independent Parallel Computation

Computer
Combining tentative and definite executions for very fast dependable parallel computing

STOC '91 Proceedings of the twenty-third annual ACM symposium on Theory of computing
Efficient parallel algorithms on restartable fail-stop processors

PODC '91 Proceedings of the tenth annual ACM symposium on Principles of distributed computing
Methods for message routing in parallel machines

STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Computing with faulty arrays

STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Efficient program transformations for resilient parallel computation via randomization (preliminary version)

STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Multiple feedback queue as a model of general purpose multiprocessor systems

CSC '92 Proceedings of the 1992 ACM annual conference on Communications
Performance analysis of a parallel theorem prover

SIGMETRICS '92/PERFORMANCE '92 Proceedings of the 1992 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Efficient optical communication in parallel computers

SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Specifying non-blocking shared memories (extended abstract)

SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Compile-time analysis of communicating processes

ICS '92 Proceedings of the 6th international conference on Supercomputing
LogP: towards a realistic model of parallel computation

PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Experiences with a model for parallel computation

PODC '93 Proceedings of the twelfth annual ACM symposium on Principles of distributed computing
Parallel algorithms column 1: models of computation

ACM SIGACT News
Simple, efficient shared memory simulations

SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Optimal broadcast and summation in the LogP model

SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Space-efficient scheduling of multithreaded computations

STOC '93 Proceedings of the twenty-fifth annual ACM symposium on Theory of computing
A survey of PRAM simulation techniques

ACM Computing Surveys (CSUR)
List ranking and list scan on the Cray C-90

SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Modeling communication in parallel algorithms: a fruitful interaction between theory and systems?

SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Efficient low-contention parallel algorithms

SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Are multiport memories physically feasible?

ACM SIGARCH Computer Architecture News - Special issue on input/output in parallel computer systems
Performance predictions for parallel diagonal-implicitly iterated Runge-Kutta methods

PADS '95 Proceedings of the ninth workshop on Parallel and distributed simulation
A randomized parallel 3D convex hull algorithm for coarse grained multicomputers

Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Accounting for memory bank contention and delay in high-bandwidth multiprocessors

Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
LogGP: incorporating long messages into the LogP model—one step closer towards a realistic model for parallel computation

Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Parallel sorting with limited bandwidth

Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Parallel Computing in Networks of Workstations with Paralex

IEEE Transactions on Parallel and Distributed Systems
Benchmark Evaluation of the IBM SP2 for Parallel Signal Processing

IEEE Transactions on Parallel and Distributed Systems
Efficient Termination Detection for Loosely Synchronous Applications in Multicomputers

IEEE Transactions on Parallel and Distributed Systems
Practical parallel algorithms for personalized communication and integer sorting

Journal of Experimental Algorithmics (JEA)
Towards efficiency and portability: programming with the BSP model

Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
BSP vs LogP

Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Improved methods for hiding latency in high bandwidth networks (extended abstract)

Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
On multiprocessor system scheduling

Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Parallel algorithms for personalized communication and sorting with an experimental study (extended abstract)

Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Deterministic sorting and randomized median finding on the BSP model

Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Fully dynamic search trees for an extension of the BSP model

Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Efficient execution of nondeterministic parallel programs on asynchronous systems

Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Communication-efficient parallel sorting (preliminary version)

STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Automatic methods for hiding latency in high bandwidth networks (extended abstract)

STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
LogP: a practical model of parallel computation

Communications of the ACM
Fast Parallel Sorting Under LogP: Experience with the CM-5

IEEE Transactions on Parallel and Distributed Systems
The Block Distributed Memory Model

IEEE Transactions on Parallel and Distributed Systems
A quantitative comparison of parallel computation models

Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Strategic directions in research in theory of computing

ACM Computing Surveys (CSUR) - Special ACM 50th-anniversary issue: strategic directions in computing research
A bridging model for parallel computation, communication, and I/O

ACM Computing Surveys (CSUR) - Special issue: position statements on strategic directions in computing research
ParCeL-1: a parallel programming language based on autonomous and synchronous actors

ACM SIGPLAN Notices
Universal Wormhole Routing

IEEE Transactions on Parallel and Distributed Systems
Parallel computation still not ready for the mainstream

Communications of the ACM
Can shared-memory model serve as a bridging model for parallel computation?

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Efficient computations on fault-prone BSP machines

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Modeling parallel bandwidth: local vs. global restrictions

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Efficient external memory algorithms by simulating coarse-grained parallel algorithms

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Pipelining with futures

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
From algorithm parallelism to instruction-level parallelism: an encode-decode chain using prefix-sum

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Improved routing and sorting on multibutterflies

STOC '97 Proceedings of the twenty-ninth annual ACM symposium on Theory of computing
A methodology for specifying data distribution using only standard object-oriented features

ICS '97 Proceedings of the 11th international conference on Supercomputing
LoPC: modeling contention in parallel algorithms

PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Accounting for Memory Bank Contention and Delay in High-Bandwidth Multiprocessors

IEEE Transactions on Parallel and Distributed Systems
Billiards and related systems on the bulk-synchronous parallel model

Proceedings of the eleventh workshop on Parallel and distributed simulation
Edge Congestion of Shortest Path Systems for All-to-All Communication

IEEE Transactions on Parallel and Distributed Systems
Efficient Algorithms for All-to-All Communications in Multiport Message-Passing Systems

IEEE Transactions on Parallel and Distributed Systems
Abstractions for Portable, Scalable Parallel Programming

IEEE Transactions on Parallel and Distributed Systems
Communication-optimal parallel minimum spanning tree algorithms (extended abstract)

Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
“Dynamic-fault-prone BSP”: a paradigm for robust computations in changing environments

Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Explicit multi-threading (XMT) bridging models for instruction parallelism (extended abstract)

Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Computational bounds for fundamental problems on general-purpose parallel models

Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
The implementation of the Cilk-5 multithreaded language

PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Optimizing and load balancing metacomputing applications

ICS '98 Proceedings of the 12th international conference on Supercomputing
Load balanced parallel radix sort

ICS '98 Proceedings of the 12th international conference on Supercomputing
Support for Efficient Programming on the SB-PRAM

International Journal of Parallel Programming
Models and languages for parallel computation

ACM Computing Surveys (CSUR)
A quantitative comparison of parallel computation models

ACM Transactions on Computer Systems (TOCS)
A new deterministic parallel sorting algorithm with an experimental evaluation

Journal of Experimental Algorithmics (JEA)
A Gracefully Degrading Massively Parallel System Using the BSP Model, and Its Evaluation

IEEE Transactions on Computers
Computing the Medial Axis Transform in Parallel With Eight Scan Operations

IEEE Transactions on Pattern Analysis and Machine Intelligence
Resource Scaling Effects on MPP Performance: The STAP Benchmark Implications

IEEE Transactions on Parallel and Distributed Systems
Communication-processor tradeoffs in limited resources PRAM

Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
BOS is boss: a case for bulk-synchronous object systems

Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
Data management in networks: experimental evaluation of a provably good strategy

Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
Parallel Performance Analysis of the Improved Quasi-Minimal Residual Method on Bulk Synchronous Parallel Architectures

The Journal of Supercomputing
Preemptive scheduling of parallel jobs on multiprocessors

Proceedings of the seventh annual ACM-SIAM symposium on Discrete algorithms
Randomized fully-scalable BSP techniques for multi-searching and convex hull construction

SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms
The QRQW PRAM: accounting for contention in parallel algorithms

SODA '94 Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms
Emulations between QSM, BSP, and LogP: a framework for general-purpose parallel algorithm design

Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Parallel virtual memory

Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Portable and Efficient Parallel Computing Using the BSP Model

IEEE Transactions on Computers
Optimal Clustering of Tree-Sweep Computations for High-Latency Parallel Environments

IEEE Transactions on Parallel and Distributed Systems
Coarse grained parallel computing on heterogeneous systems

SAC '98 Proceedings of the 1998 ACM symposium on Applied Computing
Lower Bounds on Communication Loads and Optimal Placements in Torus Networks

IEEE Transactions on Computers
A general performance model for parallel sweeps on orthogonal grids for particle transport calculations

Proceedings of the 14th international conference on Supercomputing
Compression using efficient multicasting

STOC '00 Proceedings of the thirty-second annual ACM symposium on Theory of computing
Locality-preserving load-balancing mechanisms for synchronous simulations on shared-memory multiprocessors

PADS '00 Proceedings of the fourteenth workshop on Parallel and distributed simulation
A Design Methodology for Data-Parallel Applications

IEEE Transactions on Software Engineering - Special issue on architecture-independent languages and software tools parallel processing
A Transformation Approach to Derive Efficient Parallel Implementations

IEEE Transactions on Software Engineering - Special issue on architecture-independent languages and software tools parallel processing
Scatter and gather operations on an asynchronous communication model

SAC '00 Proceedings of the 2000 ACM symposium on Applied computing - Volume 2
A formal model for the parallel semantics of P3L

SAC '00 Proceedings of the 2000 ACM symposium on Applied computing - Volume 2
Some remarks on parallel exponentiation (extended abstract)

ISSAC '00 Proceedings of the 2000 international symposium on Symbolic and algebraic computation
An object-oriented framework for data parallelism

ACM Computing Surveys (CSUR)
Task Allocation on a Network of Processors

IEEE Transactions on Computers
An Optical Bus-Based Distributed Dynamic Barrier Mechanism

IEEE Transactions on Computers
NestStep: Nested Parallelism and Virtual Shared Memory for the BSP Model

The Journal of Supercomputing
H-BSP: A Hierarchical BSP Computation Model

The Journal of Supercomputing
The implementation of MPI-2 one-sided communication for the NEC SX-5

Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Finding Hamiltonian paths in tournaments on clusters - a provably communication-efficient approach

Proceedings of the 2001 ACM symposium on Applied computing
Concurrent threads and optimal parallel minimum spanning trees algorithm

Journal of the ACM (JACM)
Room synchronizations

Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Optimal semi-oblique tiling

Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Towards practical deterministic write-all algorithms

Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Scalable isosurface visualization of massive datasets on COTS clusters

PVG '01 Proceedings of the IEEE 2001 symposium on parallel and large-data visualization and graphics
Factoring a binary polynomial of degree over one million

ACM SIGSAM Bulletin
Grid-enabled parallel divide-and-conquer: theory and practice

Proceedings of the 2002 ACM symposium on Applied computing
Communication overlap in multi-tier parallel algorithms

SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Highly portable and efficient implementations of parallel adaptive N-body methods

SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Topology preserving dynamic load balancing for parallel molecular simulations

SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Minimizing randomness in minimum spanning tree, parallel connectivity, and set maxima algorithms

SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Optimal tiling for the RNA base pairing problem

Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
Parallel dynamic programming for solving the string editing problem on a CGM/BSP

Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
P-3PC: A Point-to-Point Communication Model for Automatic and Optimal Decomposition of Regular Domain Problems

IEEE Transactions on Parallel and Distributed Systems
Use of a CORBA/RMI gateway: characterization of communication overhead

WOSP '02 Proceedings of the 3rd international workshop on Software and performance
Predicting the performance of synchronous discrete event simulation systems

Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
Architectural differences of efficient sequential and parallel computers

Journal of Systems Architecture: the EUROMICRO Journal
Adaptive parallel computing on heterogeneous networks with mpC

Parallel Computing
Performance-steered design of software architectures for embedded multicore systems

Software—Practice & Experience
A Glossary of Parallel Computing Terminology

IEEE Parallel & Distributed Technology: Systems & Technology
Nest: A Nested-Predicate Scheme for Fault Tolerance

IEEE Transactions on Computers
Program Structuring for Effective Parallel Portability

IEEE Transactions on Parallel and Distributed Systems
On Load Balancing for Distributed Multiagent Computing

IEEE Transactions on Parallel and Distributed Systems
DNA electrophoresis studied with the cage model

Journal of Computational Physics
Work-optimal simulation of PRAM models on meshes

Nordic Journal of Computing
Symbolic Performance Modeling of Parallel Systems

IEEE Transactions on Parallel and Distributed Systems
Frequency-adaptive join for shared nothing machines

Progress in computer research
A Design of Parallel R-tree on Cluster of Workstations

DNIS '00 Proceedings of the International Workshop on Databases in Networked Information Systems
Parallel Models and Job Characterization for System Scheduling

ICCS '01 Proceedings of the International Conference on Computational Science-Part II
A Coarse-Grained Parallel Algorithm for Maximal Cliques in Circle Graphs

ICCS '01 Proceedings of the International Conference on Computational Science-Part II
Parallel Bridging Models and Their Impact on Algorithm Design

ICCS '01 Proceedings of the International Conference on Computational Science-Part II
On the Effectiveness of D-BSP as a Bridging Model of Parallel Computation

ICCS '01 Proceedings of the International Conference on Computational Science-Part II
Parallel Algorithm Design with Coarse-Grained Synchronization

ICCS '01 Proceedings of the International Conference on Computational Science-Part II
Skel-BSP: Performance Portability for Skeletal Programming

HPCN Europe 2000 Proceedings of the 8th International Conference on High-Performance Computing and Networking
Algorithms for Generic Tools in Parallel Numerical Simulation

HPCN Europe 2000 Proceedings of the 8th International Conference on High-Performance Computing and Networking
Simulating Parallel Architectures with BSPlab

HPCN Europe 2001 Proceedings of the 9th International Conference on High-Performance Computing and Networking
A Randomized Algorithm for Voronoi Diagram of Line Segments on Coarse-Grained Multiprocessors

IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Interoperability of Data Parallel Runtime Libraries

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Parallel 'Go with the Winners' Algorithms in the LogP Model

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
An Out-of-Core Sorting Algorithm for Clusters with Processors at Different Speed

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
A 2-D Parallel Convex Hull Algorithm with Optimal Communication Phases

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
A BSP Approach to the Scheduling of Tightly-Nested Loops

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Optimizing Parallel Bitonic Sort

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
The Power of SIMDs in Real-Time Scheduling

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Distribution Sweeping on Clustered Machines with Hierarchical Memories

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
d-Dimensional Range Search on Multicomputers

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
A Randomized Sorting Algorithm on the BSP model

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Algorithms for SMP-Clusters Dense Matrix-Vector Multiplication

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Buckets Strike Back: Improved Parallel Shortest Paths

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Exploiting Hierarchy in Heterogeneous Environments

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
The Heterogeneous Bulk Synchronous Parallel Model

IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
Implementing Shared Memory on Clustered Machines

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
HiHCoHP: Toward a Realistic Communication Model for Hierarchical HyperClusters of Heterogeneous Processors

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
A Hardware Implementation of PRAM and Its Performance Evaluation

IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
Cost Hierarchies for Abstract Parallel Machines

LCPC '00 Proceedings of the 13th International Workshop on Languages and Compilers for Parallel Computing-Revised Papers
κNUMA: A Model for Clusters of SMP-Machines

PPAM '01 Proceedings of the th International Conference on Parallel Processing and Applied Mathematics-Revised Papers
A Language for the Complexity Analysis of Parallel Programs

PPAM '01 Proceedings of the th International Conference on Parallel Processing and Applied Mathematics-Revised Papers
Transparent Parallelisation Through Reuse: Between a Compiler and a Library Approach

ECOOP '93 Proceedings of the 7th European Conference on Object-Oriented Programming
All-Pairs Shortest Paths Computation in the BSP Model

ICALP '01 Proceedings of the 28th International Colloquium on Automata, Languages and Programming
Seamless Integration of Parallelism and Memory Hierarchy

ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Time-Sharing Parallel Jobs in the Presence of Multiple Resource Requirements

IPDPS '00/JSSPP '00 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
Improving Parallel Job Scheduling Using Runtime Measurements

IPDPS '00/JSSPP '00 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
Average-Case Communication-Optimal Parallel Parenthesis Matching

ISAAC '02 Proceedings of the 13th International Symposium on Algorithms and Computation
BSλp: Functional BSP Programs on Enumerated Vectors

ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
The ParCel-2 Programming Language (Research Note)

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Performance Prediction of an NAS Benchmark Program with ChronosMix Environment

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
On the Predictive Quality of BSP-like Cost Functions for NOWs

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Performance Prediction of Oblivious BSP Programs

Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
Towards Formally Refining BSP Barriers into Explicit Two-Sided Communications

Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
Symbolic Cost Estimation of Parallel Applications

Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Non-approximability of the Bulk Synchronous Task Scheduling Problem

Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Parallel Convex Hull Computation by Generalised Regular Sampling

Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Parallel Algorithms for Grounded Range Search and Applications

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Optimising Skeletal-Stream Parallelism on a BSP Computer

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Real PRAM Programming

Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Condensed Graphs: A Multi-level, Parallel, Intermediate Representation

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
The Skel-BSP Global Optimizer: Enhancing Performance Portability in Parallel Programming

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Oblivious BSP (Research Note)

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Parallel Geometric Algorithms in Coarse-Grain Network Models

COCOON '98 Proceedings of the 4th Annual International Conference on Computing and Combinatorics
Computational Power of BSP Computers

SOFSEM '98 Proceedings of the 25th Conference on Current Trends in Theory and Practice of Informatics: Theory and Practice of Informatics
Pipelined Decomposable BSP Computers

SOFSEM '01 Proceedings of the 28th Conference on Current Trends in Theory and Practice of Informatics Piestany: Theory and Practice of Informatics
Parallel Image Processing System on a Cluster of Personal Computers (Best Student Paper Award: First Prize)

VECPAR '00 Selected Papers and Invited Talks from the 4th International Conference on Vector and Parallel Processing
Measuring the Performance Impact of SP-Restricted Programming in Shared-Memory Machines

VECPAR '00 Selected Papers and Invited Talks from the 4th International Conference on Vector and Parallel Processing
BSP Algorithms - Write Once, Run Anywhere

WAE '99 Proceedings of the 3rd International Workshop on Algorithm Engineering
Portable List Ranking: An Experimental Study

WAE '00 Proceedings of the 4th International Workshop on Algorithm Engineering
Graph Coloring on a Coarse Grained Multiprocessor

WG '00 Proceedings of the 26th International Workshop on Graph-Theoretic Concepts in Computer Science
Coarse Grained Parallel Algorithms for Detecting Convex Bipartite Graphs

WG '00 Proceedings of the 26th International Workshop on Graph-Theoretic Concepts in Computer Science
Intensional High Performance Computing

DCW '00 Proceedings of the Third International Workshop on Distributed Communities on the Web
A Fixpoint Theory for Non-monotonic Parallelism

CSL '02 Proceedings of the 16th International Workshop and 11th Annual Conference of the EACSL on Computer Science Logic
How to Write a Healthiness Condition

IFM '00 Proceedings of the Second International Conference on Integrated Formal Methods
Performance and Predictability of MPI and BSP Programs on the CRAY T3E

Proceedings of the 6th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Handling Graphs According to a Coarse Grained Approach: Experiments with PVM and MPI

Proceedings of the 7th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Adaptive Execution of Pipelines

Proceedings of the 8th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
PVM Computation of the Transitive Closure: The Dependency Graph Approach

Proceedings of the 8th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Nested Bulk Synchronous Parallel Computing

Proceedings of the 6th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
A pi-calculus Model of a Spanish Fish Market - Preliminary Report

AMET '98 Selected Papers from the First International Workshop on Agent Mediated Electronic Trading on Agent Mediated Electronic Commerce
A Skew-insensitive Algorithm for Join and Multi-join Operations on Shared Nothing Machines

DEXA '00 Proceedings of the 11th International Conference on Database and Expert Systems Applications
Declarative Modelling of Constraint Propagation Strategies

ADVIS '00 Proceedings of the First International Conference on Advances in Information Systems
Decomposable Bulk Synchronous Parallel Computers

SOFSEM '99 Proceedings of the 26th Conference on Current Trends in Theory and Practice of Informatics
Architecture Independent Analysis of Parallel Programs

ICCS '01 Proceedings of the International Conference on Computational Science-Part II
On Stalling in LogP

IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
BSP Performance Analysis and Prediction: Tools and Application

PaCT '99 Proceedings of the 5th International Conference on Parallel Computing Technologies
A Quantitative Measure of Portability with Application to Bandwidth-Latency Models for Parallel Computing

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
A Cost Model for Asynchronous and Structured Message Passing

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
The Paderborn University BSP (PUB) library

Parallel Computing
List-ranking on interconnection networks

Information and Computation
Parallel ray tracing on a chip

Practical parallel rendering
Portable and architecture independent parallel performance tuning using BSP

Parallel Computing
Randomized permutations in a coarse grained parallel environment

Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures
1-Optimality of static BSP computations: scheduling independent chains as a case study

Theoretical Computer Science
SCL-chan: An Asynchronous Data-Parallel Language for Irregular Algorithms

HIPS '97 Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
Algorithm Design and Analysis Using the WPRAM Model

HIPS '97 Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
Abstracting network characteristics and locality properties of parallel systems

HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Operator Design Pattern for Data Parallel Computation

TOOLS '97 Proceedings of the Tools-23: Technology of Object-Oriented Languages and Systems
Architecture independent parallel selection with applications to parallel priority queues

Theoretical Computer Science
Algorithm engineering for parallel computation

Experimental algorithmics
Gracefully Degrading Systems Using the Bulk-Synchronous Parallel Model with Randomised Shared Memory

FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Deterministic Routing of h-relations on the Multibutterfly

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Experimental Validation of Parallel Computation Models on the Intel Paragon

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Predicting the Running Times of Parallel Programs by Simulation

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Efficient Barrier Synchronization Mechanism for the BSP Model on Message-Passing Architectures

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Contention-Aware Communication Schedule for High-Speed Communication

Cluster Computing
Portable list ranking: an experimental study

Journal of Experimental Algorithmics (JEA)
Graph coloring on coarse grained multicomputers

Discrete Applied Mathematics - Special issue: The second international colloquium, "journées de l'informatique messine"
Parallel 'go with the winners' algorithms in distributed memory models

Journal of Parallel and Distributed Computing - Special section best papers from the 2002 international parallel and distributed processing symposium
Incorporating memory layout in the modeling of message passing programs

Journal of Systems Architecture: the EUROMICRO Journal - Special issue: Parallel, distributed and network-based processing
Parallel implementation of geometric shortest path algorithms

Parallel Computing - Special issue: High performance computing with geographical data
Deterministic computations on a PRAM with static processor and memory faults

Fundamenta Informaticae
Optimal broadcast on parallel locality models

Journal of Discrete Algorithms
A fixpoint theory for non-monotonic parallelism

Theoretical Computer Science
Emulations between QSM, BSP and LogP: a framework for general-purpose parallel algorithm design

Journal of Parallel and Distributed Computing
Solving large FPT problems on coarse-grained parallel machines

Journal of Computer and System Sciences - Special issue on Parameterized computation and complexity
Parallel and Distributed Haskells

Journal of Functional Programming
Logic of global synchrony

ACM Transactions on Programming Languages and Systems (TOPLAS)
Architecture independent parallel binomial tree option price valuations

Parallel Computing
Parallelism versus memory allocation in pipelined router forwarding engines

Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Predicting the performance of parallel programs

Parallel Computing
Checkpointing-based rollback recovery for parallel applications on the InteGrade grid middleware

MGC '04 Proceedings of the 2nd workshop on Middleware for grid computing
Parallel and distributed simulation: managing external workload with BSP time warp

Proceedings of the 34th conference on Winter simulation: exploring new frontiers
Predicting the Performance of Synchronous Discrete Event Simulation

IEEE Transactions on Parallel and Distributed Systems
A Geometric Programming Framework for Optimal Multi-Level Tiling

Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Predicting and Evaluating Distributed Communication Performance

Proceedings of the 2004 ACM/IEEE conference on Supercomputing
BCS-MPI: A New Approach in the System Software Design for Large-Scale Parallel Computers

Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Coarse grained gather and scatter operations with applications

Journal of Parallel and Distributed Computing
On performance analysis of heterogeneous parallel algorithms

Parallel Computing
Stream PRAM

IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 14 - Volume 15
A Multiple Associative Model to Support Branches in Data Parallel Applications using the Manager-Worker Paradigm

IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 14 - Volume 15
Productivity in High Performance Computing

International Journal of High Performance Computing Applications
On stalling in LogP

Journal of Parallel and Distributed Computing
A tight analysis and near-optimal instances of the algorithm of Anderson and Woll

Theoretical Computer Science
A Coarse-Grained multicomputer algorithm for the detection of repetitions

Information Processing Letters
Loci: a rule-based framework for parallel multi-disciplinary simulation synthesis

Journal of Functional Programming
Parallel processing

Encyclopedia of Computer Science
The design and implementation of LilyTask in shared memory

ACM SIGOPS Operating Systems Review
On deadlocks of exclusive AND-requests for resources

Distributed Computing
Cluster Computing for Determining Three-Dimensional Protein Structure

The Journal of Supercomputing
A static analysis for bulk synchronous parallel ML to avoid parallel nesting

Future Generation Computer Systems - Special issue: Parallel computing technologies
Adaptive Parallel Job Scheduling with Flexible Coscheduling

IEEE Transactions on Parallel and Distributed Systems
Lifting sequential graph algorithms for distributed-memory parallel computation

OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Application Resource Requirement Estimation in a Parallel-Pipeline Model of Execution

IEEE Transactions on Parallel and Distributed Systems
The implementation of the BSP parallel computing model on the InteGrade Grid middleware

MGC '05 Proceedings of the 3rd international workshop on Middleware for grid computing
The MHETA Execution Model for Heterogeneous Clusters

SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Towards a more realistic BSP cost model

HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
Self-Organizing Communication-aware Resource Management for Scheduling in Grid Environment

HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
An experimental evaluation of the HP V-class and SGI origin 2000 multiprocessors using microbenchmarks and scientific applications

International Journal of Parallel Programming
Exchanging messages of different sizes

Journal of Parallel and Distributed Computing
Communication-optimal parallel parenthesis matching

Parallel Computing
Translating submachine locality into locality of reference

Journal of Parallel and Distributed Computing - Special issue: 18th International parallel and distributed processing symposium
PEMPIs: a new methodology of modeling and prediction of MPI programs performance

International Journal of Parallel Programming
Efficient automatic simulation of parallel computation on networks of workstations

Discrete Applied Mathematics
Modeling instruction placement on a spatial architecture

Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures
The cache complexity of multithreaded cache oblivious algorithms

Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures
Designing irregular parallel algorithms with mutual exclusion and lock-free protocols

Journal of Parallel and Distributed Computing
An Accurate Communication Model of a Heterogeneous Cluster Based on a Switch-Enabled Ethernet Network

ICPADS '06 Proceedings of the 12th International Conference on Parallel and Distributed Systems - Volume 2
Realizing the e-science desktop peer using a peer-to-peer distributed virtual machine middleware

Proceedings of the 4th international workshop on Middleware for grid computing
A Parallel Computational Model for Heterogeneous Clusters

IEEE Transactions on Parallel and Distributed Systems
Modeling contention of sparse-matrix-vector multiplication (SMV) in three parallel programming paradigms

WOSP '07 Proceedings of the 6th international workshop on Software and performance
Linear work suffix array construction

Journal of the ACM (JACM)
Stochastic modeling and analysis of hybrid mobility in reconfigurable distributed virtual machines

Journal of Parallel and Distributed Computing
Mondriaan sparse matrix partitioning for attacking cryptosystems by a parallel block Lanczos algorithm: a case study

Parallel Computing - Algorithmic skeletons
Editorial: Introduction to the special issue on semantics and costs models for high-level parallel programming

Computer Languages, Systems and Structures
A bulk-synchronous parallel process algebra

Computer Languages, Systems and Structures
Communication-efficient parallel generic pairwise elimination

Future Generation Computer Systems - Special section: Information engineering and enterprise architecture in distributed computing environments
Coprocessor design to support MPI primitives in configurable multiprocessors

Integration, the VLSI Journal
Performance engineering, PSEs and the GRID

Scientific Programming
A tool for performance modeling of parallel programs

Scientific Programming
Remote memory access: A case for portable, efficient and library independent parallel programming

Scientific Programming
MapReduce: simplified data processing on large clusters

OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
$\log_{\rm n}{\rm P}$ and $\log_{3}{\rm P}$: Accurate Analytical Models of Point-to-Point Communication in Distributed Systems

IEEE Transactions on Computers
Modeling contention of giga-updates per second (GUPs) in three parallel programming paradigms

PDCN'07 Proceedings of the 25th IASTED International Multi-Conference: parallel and distributed computing and networks
Parallel Scripting with Python

Computing in Science and Engineering
High performance combinatorial algorithm design on the Cell Broadband Engine processor

Parallel Computing
Load balancing distributed inverted files

Proceedings of the 9th annual ACM international workshop on Web information and data management
High-performance distributed inverted files

Proceedings of the sixteenth ACM conference on information and knowledge management
Relation-based computations in a monadic BSP model

Parallel Computing
PRO: a model for the design and analysis of efficient and scalable parallel algorithms

Nordic Journal of Computing
Coarse grained parallel algorithms for graph matching

Parallel Computing
Optimistic parallelism benefits from data partitioning

Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Efficient sampling of random permutations

Journal of Discrete Algorithms
A framework for scalable greedy coloring on distributed-memory parallel computers

Journal of Parallel and Distributed Computing
HPM: a hierarchical model for parallel computations

International Journal of High Performance Computing and Networking
Distributed computing and the multicore revolution

ACM SIGACT News
BSGP: bulk-synchronous GPU programming

ACM SIGGRAPH 2008 papers
A regression-based approach to scalability prediction

Proceedings of the 22nd annual international conference on Supercomputing
Optimal speedup on a low-degree multi-core parallel architecture (LoPRAM)

Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Fundamental parallel algorithms for private-cache chip multiprocessors

Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Cache-efficient dynamic programming algorithms for multicores

Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Time-sharing parallel applications through performance-targeted feedback-controlled real-time scheduling

Cluster Computing
A framework for adaptive collective communications for heterogeneous hierarchical computing systems

Journal of Computer and System Sciences
Decomposing Verification Around End-User Features

Verified Software: Theories, Tools, Experiments
Searching and Updating Metric Space Databases Using the Parallel EGNAT

ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
Efficient Parallel Tree Reductions on Distributed Memory Environments

ICCS '07 Proceedings of the 7th international conference on Computational Science, Part II
Resource Load Balancing Based on Multi-agent in ServiceBSP Model

ICCS '07 Proceedings of the 7th international conference on Computational Science, Part III: ICCS 2007
Exploiting Hybrid Parallelism in Web Search Engines

Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
Scheduling Intersection Queries in Term Partitioned Inverted Files

Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
A PGAS-Based Algorithm for the Longest Common Subsequence Problem

Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
A Search Engine Index for Multimedia Content

Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
A Bridging Model for Multi-core Computing

ESA '08 Proceedings of the 16th annual European symposium on Algorithms
Parallel methods for absolute irreducibility testing

The Journal of Supercomputing
High-performance priority queues for parallel crawlers

Proceedings of the 10th ACM workshop on Web information and data management
RAT: RC Amenability Test for Rapid Performance Prediction

ACM Transactions on Reconfigurable Technology and Systems (TRETS)
A unified model for multicore architectures

IFMT '08 Proceedings of the 1st international forum on Next-generation multicore/manycore technologies
Scalable isosurface visualization of massive datasets on commodity off-the-shelf clusters

Journal of Parallel and Distributed Computing
Parallel query processing on distributed clustering indexes

Journal of Discrete Algorithms
Mapping parallelism to multi-cores: a machine learning based approach

Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Boolean circuit programming: A new paradigm to design parallel algorithms

Journal of Discrete Algorithms
Compile-Time and Run-Time Issues in an Auto-Parallelisation System for the Cell BE Processor

Euro-Par 2008 Workshops - Parallel Processing
The work of Leslie Valiant

Proceedings of the forty-first annual ACM symposium on Theory of computing
The systems edge of the Parameterized Linear Array with a Reconfigurable Pipelined Bus System (LARPBS(p)) optical bus parallel computing model

The Journal of Supercomputing
Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping

Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Rigel: an architecture and scalable programming interface for a 1000-core accelerator

Proceedings of the 36th annual international symposium on Computer architecture
Efficient parallel Text Retrieval techniques on Bulk Synchronous Parallel (BSP)/Coarse Grained Multicomputers (CGM)

The Journal of Supercomputing
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation

Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
psort, Yet Another Fast Stable Sorting Software

SEA '09 Proceedings of the 8th International Symposium on Experimental Algorithms
BSP2OMP: A Compiler for Translating BSP Programs to OpenMP

International Journal of Parallel, Emergent and Distributed Systems - Advances in Parallel and Distributed Computational Models
Speeding up genetic programming: a parallel BSP implementation

GECCO '96 Proceedings of the 1st annual conference on Genetic and evolutionary computation
Configurable emulated shared memory architecture for general purpose MP-SOCs and NOC regions

NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
Performance implications of synchronization structure in parallel programming

Parallel Computing
OSL: Optimized Bulk Synchronous Parallel Skeletons on Distributed Arrays

APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
Evaluating multicore algorithms on the unified memory model

Scientific Programming - Software Development for Multi-core Computing Systems
Mobile processes, mobile channels and complex dynamic systems

CEC'09 Proceedings of the Eleventh conference on Congress on Evolutionary Computation
Scalable communication protocols for dynamic sparse data exchange

Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
A Generic Cost Model for Concurrent and Data-parallel Meta-computing

Electronic Notes in Theoretical Computer Science (ENTCS)
Santa Claus: Formal analysis of a process-oriented solution

ACM Transactions on Programming Languages and Systems (TOPLAS)
Cortical architectures on a GPGPU

Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units
Algorithms for memory hierarchies: advanced lectures

Algorithms for memory hierarchies: advanced lectures
A model for estimating the performance of synchronous parallel network simulation

International Journal of Modelling and Simulation
Sync/Async parallel search for the efficient design and construction of web search engines

Parallel Computing
Application execution management on the InteGrade opportunistic grid middleware

Journal of Parallel and Distributed Computing
Simple linear work suffix array construction

ICALP'03 Proceedings of the 30th international conference on Automata, languages and programming
Anahy: a programming environment for cluster computing

VECPAR'06 Proceedings of the 7th international conference on High performance computing for computational science
A coarse-grained multicomputer algorithm for the longest repeated suffix ending at each point in a word

ICCSA'03 Proceedings of the 2003 international conference on Computational science and its applications: PartII
A parallel wavefront algorithm for efficient biological sequence comparison

ICCSA'03 Proceedings of the 2003 international conference on Computational science and its applications: PartII
Towards realistic implementations of external memory algorithms using a coarse grained paradigm

ICCSA'03 Proceedings of the 2003 international conference on Computational science and its applications: PartII
Parallel superposition for bulk synchronous parallel ML

ICCS'03 Proceedings of the 2003 international conference on Computational science: PartIII
A parallel virtual machine for bulk synchronous parallel ML

ICCS'03 Proceedings of the 1st international conference on Computational science: PartI
A parallel programming environment on grid

ICCS'03 Proceedings of the 1st international conference on Computational science: PartI
Mapping unstructured applications into nested parallelism

VECPAR'02 Proceedings of the 5th international conference on High performance computing for computational science
Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs

Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
NestStepModelica: mathematical modeling and bulk-synchronous parallel simulation

PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
The parXXL environment: scalable fine grained development for large coarse grained platforms

PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Experiments with a parallel external memory system

HiPC'07 Proceedings of the 14th international conference on High performance computing
A parallel BSP algorithm for irregular dynamic programming

APPT'07 Proceedings of the 7th international conference on Advanced parallel processing technologies
Modeling multigrain parallelism on heterogeneous multi-core processors: a case study of the cell BE

HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Divide-and-conquer parallel programming with minimally synchronous parallel ML

PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics
Pregel: a system for large-scale graph processing

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Low depth cache-oblivious algorithms

Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
New algorithms for efficient parallel string comparison

Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Cohesion: a hybrid memory model for accelerators

Proceedings of the 37th annual international symposium on Computer architecture
Efficient partial-duplicate detection based on sequence matching

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Methodology for Efficient Execution of SPMD Applications on Multicore Environments

CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
mPlogP: A Parallel Computation Model for Heterogeneous Multi-core Computer

CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
New caching techniques for web search engines

Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Parallel processing of data from very large-scale wireless sensor networks

Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems

Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Estimating parallel performance, a skeleton-based approach

Proceedings of the fourth international workshop on High-level parallel programming and applications
Parallel greedy graph matching using an edge partitioning approach

Proceedings of the fourth international workshop on High-level parallel programming and applications
Hybrid bulk synchronous parallelism library for clustered smp architectures

Proceedings of the fourth international workshop on High-level parallel programming and applications
Building efficient multi-threaded search nodes

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
A model of computation for MapReduce

SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
Revisiting Cramer's rule for solving dense linear systems

SpringSim '10 Proceedings of the 2010 Spring Simulation Multiconference
Parallel algorithms

Algorithms and theory of computation handbook
Parallel computation: models and complexity issues

Algorithms and theory of computation handbook
Parallel longest increasing subsequences in scalable time and memory

PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part I
Fast PGAS Implementation of Distributed Graph Algorithms

Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Parallel selection by regular sampling

Euro-Par'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part II
A bridging model for multi-core computing

Journal of Computer and System Sciences
A middleware for parallel processing of large graphs

Proceedings of the 8th International Workshop on Middleware for Grids, Clouds and e-Science
Energy considerations for divisible load processing

PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part II
Model oriented profiling of parallel programs

EUROMICRO-PDP'02 Proceedings of the 10th Euromicro conference on Parallel, distributed and network-based processing
Predictability of bulk synchronous programs using MPI

EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Specification for reactive bulk-synchronous programming

EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Groups in bulk synchronous parallel computing

EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
The parallel cellular programming model

EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Towards efficient BSP implementations of BSR programs for some computational geometry problems

EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Cache-oblivious simulation of parallel programs

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Towards a parallel framework of grid-based numerical algorithms on DAGs

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
LogfP - a model for small messages in InfiniBand

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Cost evaluation from specifications for BSP programs

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Support for adaptivity in ARMCI using migratable objects

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Algorithm engineering: bridging the gap between algorithm theory and practice

Algorithm engineering: bridging the gap between algorithm theory and practice
Piccolo: building fast, distributed programs with partitioned tables

OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
3D block-based medial axis transform and chessboard distance transform based on dominance

Image and Vision Computing
Throughput-Effective On-Chip Networks for Manycore Accelerators

MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Light-weight communications on Intel's single-chip cloud computer processor

ACM SIGOPS Operating Systems Review
A high-level framework for distributed processing of large-scale graphs

ICDCN'11 Proceedings of the 12th international conference on Distributed computing and networking
Kanor: a declarative language for explicit communication

PADL'11 Proceedings of the 13th international conference on Practical aspects of declarative languages
Applying process migration on a BSP-based LU decomposition application

VECPAR'10 Proceedings of the 9th international conference on High performance computing for computational science
A framework for parallel genetic algorithms on PC cluster

IMCAS'06 Proceedings of the 5th WSEAS international conference on Instrumentation, measurement, circuits and systems
Parallel evaluation of conjunctive queries

Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Brief announcement: large-scale multimaps

Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Scheduling irregular parallel computations on hierarchical caches

Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
The round complexity of distributed sorting: extended abstract

Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing
Active pebbles: parallel programming for data-driven applications

Proceedings of the international conference on Supercomputing
An analytical model for multilevel performance prediction of Multi-FPGA systems

ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Balance principles for algorithm-architecture co-design

HotPar'11 Proceedings of the 3rd USENIX conference on Hot topic in parallelism
HipG: parallel processing of large-scale graphs

ACM SIGOPS Operating Systems Review
Spatial hardware implementation for sparse graph algorithms in GraphStep

ACM Transactions on Autonomous and Adaptive Systems (TAAS)
Performance modeling for multilevel communication in SHMEM+

Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model
Petri-nets as an intermediate representation for heterogeneous architectures

Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part II
Kernel-based offload of collective operations: implementation, evaluation and lessons learned

Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part II
Cache size in a cost model for heterogeneous skeletons

Proceedings of the fifth international workshop on High-level parallel programming and applications
Type system for a safe execution of parallel programs in BSML

Proceedings of the fifth international workshop on High-level parallel programming and applications
DOT: a matrix model for analyzing, optimizing and deploying software for big data analytics in distributed systems

Proceedings of the 2nd ACM Symposium on Cloud Computing
Making time-stepped applications tick in the cloud

Proceedings of the 2nd ACM Symposium on Cloud Computing
A formal programming model of Orléans skeleton library

PaCT'11 Proceedings of the 11th international conference on Parallel computing technologies
Oracle scheduling: controlling granularity in implicitly parallel languages

Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
A framework for an automatic hybrid MPI+OpenMP code generation

Proceedings of the 19th High Performance Computing Symposia
The Combinatorial BLAS: design, implementation, and applications

International Journal of High Performance Computing Applications
ParallelGDB: a parallel graph database based on cache specialization

Proceedings of the 15th Symposium on International Database Engineering & Applications
Optimizing explicit data transfers for data parallel applications on the cell architecture

ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Paging for multi-core shared caches

Proceedings of the 3rd Innovations in Theoretical Computer Science Conference
Bounded arboricity to determine the local structure of sparse graphs

WG'06 Proceedings of the 32nd international conference on Graph-Theoretic Concepts in Computer Science
Bulk synchronous parallel ML: semantics and implementation of the parallel juxtaposition

CSR'06 Proceedings of the First international computer science conference on Theory and Applications
Total exchange performance modelling under network contention

PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
A web computing environment for parallel algorithms in Java

PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
Load balancing strategies in a web computing environment

PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
Multi-DaC programming model: a variant of multi-BSP model for divide-and-conquer algorithms

DAMP '12 Proceedings of the 7th workshop on Declarative aspects and applications of multicore programming
PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation

Parallel Computing
Design of the force field task assignment method and associated performance evaluation for desktop grids

GCC'05 Proceedings of the 4th international conference on Grid and Cooperative Computing
ServiceBSP model with QoS considerations in grids

APWeb'06 Proceedings of the 2006 international conference on Advanced Web and Network Technologies, and Applications
An index data structure for searching in metric space databases

ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part I
A CGM algorithm solving the longest increasing subsequence problem

ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part V
Efficient longest common subsequence computation using bulk-synchronous parallelism

ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part V
Algorithmic-parameter optimization of a parallelized split-step Fourier transform using a modified BSP cost model

ISPA'04 Proceedings of the Second international conference on Parallel and Distributed Processing and Applications
A BSP/CGM algorithm for finding all maximal contiguous subsequences of a sequence of numbers

Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
A preliminary nested-parallel framework to efficiently implement scientific applications

VECPAR'04 Proceedings of the 6th international conference on High Performance Computing for Computational Science
SPC-XML: a structured representation for nested-parallel programming languages

Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Towards a bulk-synchronous distributed shared memory programming environment for grids

PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
Scheduling moldable BSP tasks

JSSPP'05 Proceedings of the 11th international conference on Job Scheduling Strategies for Parallel Processing
Advantages of backward searching — efficient secondary memory and distributed implementation of compressed suffix arrays

ISAAC'04 Proceedings of the 15th international conference on Algorithms and Computation
Green-Marl: a DSL for easy and efficient graph analysis

ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Fast concurrency control for distributed inverted files

ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part I
Efficient parallelization of spatial approximation trees

ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part I
Dynamic memory management in the loci framework

ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part II
Bulk synchronous parallel ML: modular implementation and performance prediction

ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part II
Modeling execution time of selected computation and communication kernels on grids

EGC'05 Proceedings of the 2005 European conference on Advances in Grid Computing
SIMD re-convergence at thread frontiers

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Generating C code from LOGS specifications

ICTAC'05 Proceedings of the Second international conference on Theoretical Aspects of Computing
Canal: scaling social network-based Sybil tolerance schemes

Proceedings of the 7th ACM european conference on Computer Systems
SGL: towards a bridging model for heterogeneous hierarchical platforms

International Journal of High Performance Computing and Networking
Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers

Foundations and Trends® in Machine Learning
Continuous monitoring in the dynamic sensor field model

ALGOSENSORS'11 Proceedings of the 7th international conference on Algorithms for Sensor Systems, Wireless Ad Hoc Networks and Autonomous Mobile Entities
Palovca: describing and executing graph algorithms in Haskell

PADL'12 Proceedings of the 14th international conference on Practical Aspects of Declarative Languages
Sorting, searching, and simulation in the MapReduce framework

ISAAC'11 Proceedings of the 22nd international conference on Algorithms and Computation
Design and implementation of a parallel cellular language for MIMD architectures

Computer Languages
The efficiency of MapReduce in parallel external memory

LATIN'12 Proceedings of the 10th Latin American international conference on Theoretical Informatics
Diderot: a parallel DSL for image analysis and visualization

Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
An object-oriented bulk synchronous parallel library for multicore programming

Concurrency and Computation: Practice & Experience
Survey: Computational models for networks of tiny artifacts: A survey

Computer Science Review
Space-round tradeoffs for MapReduce computations

Proceedings of the 26th ACM international conference on Supercomputing
Load Balancing Query Processing in Metric-Space Similarity Search

CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Communication-optimal parallel algorithm for Strassen's matrix multiplication

Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures
A coarse-grained parallel algorithm for the matrix chain order problem

Proceedings of the 2012 Symposium on High Performance Computing
BC-PDM: data mining, social network analysis and text mining system based on cloud computing

Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Managing large graphs on multi-cores with graph awareness

USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
A novel parallel algorithm for Gaussian elimination of sparse unsymmetric matrices

PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I
An experimental comparison of load balancing strategies in a web computing environment

PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part II
Verification of a heat diffusion simulation written with Orléans skeleton library

PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part II
Counter automata for parameterised timing analysis of box-based systems

FOPARA'11 Proceedings of the Second international conference on Foundational and Practical Aspects of Resource Analysis
A yoke of oxen and a thousand chickens for heavy lifting graph processing

Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Deterministic Computations on a PRAM with Static Processor and Memory Faults

Fundamenta Informaticae
A black-box approach to understanding concurrency in DaCapo

Proceedings of the ACM international conference on Object oriented programming systems languages and applications
To boldly go: an occam-π mission to engineer emergence

Natural Computing: an international journal
Through the concurrency gateway: a challenge from the near future of graphics hardware

EG PGV'04 Proceedings of the 5th Eurographics conference on Parallel Graphics and Visualization
GraphChi: large-scale graph computation on just a PC

OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Breaking the speed and scalability barriers for graph exploration on distributed-memory machines

SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Bamboo: translating MPI applications to a latency-tolerant, data-driven form

SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Aspen: a domain specific language for performance modeling

SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Optimization principles for collective neighborhood communications

SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Facilitating real-time graph mining

Proceedings of the fourth international workshop on Cloud data management
A scheduling toolkit for multiprocessor-task programming with dependencies

Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Load balancing on an interactive multiplayer game server

Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
A search engine accepting on-line updates

Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Techniques for designing efficient parallel graph algorithms for SMPs and multicore processors

ISPA'07 Proceedings of the 5th international conference on Parallel and Distributed Processing and Applications
Self-consistent MPI performance requirements

PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
(Sync|Async)+ MPI search engines

PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Continuous monitoring in the dynamic sensor field model

Theoretical Computer Science
Application-driven analysis of two generations of capability computing: the transition to multicore processors

Concurrency and Computation: Practice & Experience
Towards a complexity model for design and analysis of PGAS-based algorithms

HPCC'07 Proceedings of the Third international conference on High Performance Computing and Communications
Job scheduling using successive linear programming approximations of a sparse model

Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
A lower bound technique for communication on BSP with application to the FFT

Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
A verified library of algorithmic skeletons on evenly distributed arrays

ICA3PP'12 Proceedings of the 12th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
Using Pregel-like Large Scale Graph Processing Frameworks for Social Network Analysis

ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)
Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation

MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Iterative parallel data processing with stratosphere: an inside look

Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Minimal MapReduce algorithms

Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Cumulon: optimizing statistical data analysis in the cloud

Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Mizan: a system for dynamic load balancing in large-scale graph processing

Proceedings of the 8th ACM European Conference on Computer Systems
Presto: distributed machine learning and graph processing with sparse matrices

Proceedings of the 8th ACM European Conference on Computer Systems
Bandwidth-optimal all-to-all exchanges in fat tree networks

Proceedings of the 27th international ACM conference on International conference on supercomputing
Expressing graph algorithms using generalized active messages

Proceedings of the 27th international ACM conference on International conference on supercomputing
Parameterised architectural patterns for providing cloud service fault tolerance with accurate costings

Proceedings of the 16th International ACM Sigsoft symposium on Component-based software engineering
A divide and conquer approach and a work-optimal parallel algorithm for the LIS problem

Information Processing Letters
Early experiences in using a domain-specific language for large-scale graph analysis

First International Workshop on Graph Data Management Experiences and Systems
GPS: a graph processing system

Proceedings of the 25th International Conference on Scientific and Statistical Database Management
Approximate parallel simulation of web search engines

Proceedings of the 2013 ACM SIGSIM conference on Principles of advanced discrete simulation
Fast greedy algorithms in MapReduce and streaming

Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures
G-path: flexible path pattern query on large graphs

Proceedings of the 22nd international conference on World Wide Web companion
WTF: the who to follow service at Twitter

Proceedings of the 22nd international conference on World Wide Web
Estimating parallel performance

Journal of Parallel and Distributed Computing
The von Neumann architecture is due for retirement

HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
Distributed data management using MapReduce

ACM Computing Surveys (CSUR)
An improved parallel singular value algorithm and its implementation for multicore hardware

SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
PAGE: a partition aware graph computation engine

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
"All roads lead to Rome": optimistic recovery for distributed iterative data processing

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Modeling synthetic aperture radar computation with Aspen

International Journal of High Performance Computing Applications
Designing on-chip networks for throughput accelerators

ACM Transactions on Architecture and Code Optimization (TACO)
Certified Information Access

Journal of Systems and Software
Analysis of partitioning strategies for graph processing in bulk synchronous parallel models

Proceedings of the fifth international workshop on Cloud data management
Leveraging transactional memory for a predictable execution of applications composed of hard real-time and best-effort tasks

Proceedings of the 21st International conference on Real-Time Networks and Systems
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles

ACM SIGOPS 24th Symposium on Operating Systems Principles
A lightweight infrastructure for graph analytics

Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
The family of MapReduce and large-scale data processing systems

ACM Computing Surveys (CSUR)
DANBI: dynamic scheduling of irregular stream programs for many-core systems

PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Extending modern PaaS clouds with BSP to execute legacy MPI applications

Proceedings of the 4th annual Symposium on Cloud Computing
Programming with BSP homomorphisms

Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
Giraphx: parallel yet serializable large-scale graph processing

Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
BDMPI: conquering BigData with small clusters using MPI

DISCS-2013 Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems
The energy case for graph processing on hybrid CPU and GPU systems

IA^3 '13 Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms
Making queries tractable on big data with preprocessing: through the eyes of complexity theory

Proceedings of the VLDB Endowment
Ingredients of adaptability: a survey of reconfigurable processors

VLSI Design
Scale-out NUMA

Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Efficient query evaluation on distributed graphs with Hadoop environment

Proceedings of the Fourth Symposium on Information and Communication Technology
Simplifying Scalable Graph Processing with a Domain-Specific Language

Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
Red Fox: An Execution Environment for Relational Query Processing on GPUs

Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
PREDIcT: towards predicting the runtime of large scale iterative analytics

Proceedings of the VLDB Endowment
Distributed socialite: a datalog-based language for large-scale graph analysis

Proceedings of the VLDB Endowment
Horton+: a distributed system for processing declarative reachability queries over partitioned graphs

Proceedings of the VLDB Endowment
Fast iterative graph computation with block updates

Proceedings of the VLDB Endowment
Compiling Fresh Breeze Codelets

Proceedings of Programming Models and Applications on Multicores and Manycores
Efficient Parallel Implementations of Multiple Sequence Alignment using BSP/CGM Model

Proceedings of Programming Models and Applications on Multicores and Manycores
A memory access model for highly-threaded many-core architectures

Future Generation Computer Systems
Apple-CORE: Harnessing general-purpose many-cores with hardware concurrency management

Microprocessors & Microsystems
Parallel processing of large graphs

Future Generation Computer Systems
Minimizing synchronizations in sparse iterative solvers for distributed supercomputers

Computers & Mathematics with Applications
Measurement of the latency parameters of the Multi-BSP model: a multicore benchmarking approach

The Journal of Supercomputing
Exploiting inter-operation parallelism for matrix chain multiplication using MapReduce

The Journal of Supercomputing
Is multicore hardware for general-purpose parallel processing broken?

Communications of the ACM
NEWT - A Fault Tolerant BSP Framework on Hadoop YARN

UCC '13 Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing
A Randomized Parallel Three-Dimensional Convex Hull Algorithm for Coarse-Grained Multicomputers

Theory of Computing Systems
Modelling Search Engines Performance Using Coloured Petri Nets

Fundamenta Informaticae - Application and Theory of Petri Nets and Concurrency, 2012

Quantified Score

Hi-index	48.30

Abstract

The success of the von Neumann model of sequential computation is attributable to the fact that it is an efficient bridge between software and hardware: high-level languages can be efficiently compiled onto this model, yet it can be efficiently implemented in hardware. The author argues that an analogous bridge between software and hardware is required for parallel computation if that is to become as widely used. This article introduces the bulk-synchronous parallel (BSP) model as a candidate for this role, and gives results quantifying its efficiency both in implementing high-level language features and algorithms and in being implemented in hardware.