Efficient Schemes for Parallel Communication
Journal of the ACM (JACM)
Routing, merging, and sorting on parallel models of computation
Journal of Computer and System Sciences
Type architectures, shared memory, and the corollary of modest potential
Annual review of computer science vol. 1, 1986
Parallel algorithmic techniques for combinatorial computation
Annual review of computer science: vol. 3, 1988
Tight bounds on the complexity of parallel sorting
IEEE Transactions on Computers
Towards an architecture-independent analysis of parallel algorithms
STOC '88 Proceedings of the twentieth annual ACM symposium on Theory of computing
Portable programming within a message-passing model: the FFT as an example
C3P Proceedings of the third conference on Hypercube concurrent computers and applications - Volume 2
Optimal and sublogarithmic time randomized parallel sorting algorithms
SIAM Journal on Computing
SPAA '89 Proceedings of the first annual ACM symposium on Parallel algorithms and architectures
Communication complexity of PRAMs
Theoretical Computer Science - Special issue: Fifteenth international colloquium on automata, languages and programming, Tampere, Finland, July 1988
A complexity theory of efficient parallel algorithms
Theoretical Computer Science - Special issue: Fifteenth international colloquium on automata, languages and programming, Tampere, Finland, July 1988
From on-line to batch learning
COLT '89 Proceedings of the second annual workshop on Computational learning theory
Parallel algorithms for shared-memory machines
Handbook of theoretical computer science (vol. A)
General purpose parallel architectures
Handbook of theoretical computer science (vol. A)
Journal of the ACM (JACM)
Parallel hashing: an efficient implementation of shared memory
Journal of the ACM (JACM)
ACM Transactions on Programming Languages and Systems (TOPLAS)
Combining tentative and definite executions for very fast dependable parallel computing
STOC '91 Proceedings of the twenty-third annual ACM symposium on Theory of computing
Efficient parallel algorithms on restartable fail-stop processors
PODC '91 Proceedings of the tenth annual ACM symposium on Principles of distributed computing
Methods for message routing in parallel machines
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Multiple feedback queue as a model of general purpose multiprocessor systems
CSC '92 Proceedings of the 1992 ACM annual conference on Communications
Performance analysis of a parallel theorem prover
SIGMETRICS '92/PERFORMANCE '92 Proceedings of the 1992 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Efficient optical communication in parallel computers
SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Specifying non-blocking shared memories (extended abstract)
SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Compile-time analysis of communicating processes
ICS '92 Proceedings of the 6th international conference on Supercomputing
LogP: towards a realistic model of parallel computation
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Experiences with a model for parallel computation
PODC '93 Proceedings of the twelfth annual ACM symposium on Principles of distributed computing
Parallel algorithms column 1: models of computation
ACM SIGACT News
Simple, efficient shared memory simulations
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Optimal broadcast and summation in the LogP model
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Space-efficient scheduling of multithreaded computations
STOC '93 Proceedings of the twenty-fifth annual ACM symposium on Theory of computing
A survey of PRAM simulation techniques
ACM Computing Surveys (CSUR)
List ranking and list scan on the Cray C-90
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Modeling communication in parallel algorithms: a fruitful interaction between theory and systems?
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Efficient low-contention parallel algorithms
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Are multiport memories physically feasible?
ACM SIGARCH Computer Architecture News - Special issue on input/output in parallel computer systems
Are multiport memories physically feasible?
ACM SIGARCH Computer Architecture News
Performance predictions for parallel diagonal-implicitly iterated Runge-Kutta methods
PADS '95 Proceedings of the ninth workshop on Parallel and distributed simulation
A randomized parallel 3D convex hull algorithm for coarse grained multicomputers
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Accounting for memory bank contention and delay in high-bandwidth multiprocessors
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Parallel sorting with limited bandwidth
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Parallel Computing in Networks of Workstations with Paralex
IEEE Transactions on Parallel and Distributed Systems
Benchmark Evaluation of the IBM SP2 for Parallel Signal Processing
IEEE Transactions on Parallel and Distributed Systems
Efficient Termination Detection for Loosely Synchronous Applications in Multicomputers
IEEE Transactions on Parallel and Distributed Systems
Practical parallel algorithms for personalized communication and integer sorting
Journal of Experimental Algorithmics (JEA)
Towards efficiency and portability: programming with the BSP model
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Improved methods for hiding latency in high bandwidth networks (extended abstract)
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
On multiprocessor system scheduling
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Deterministic sorting and randomized median finding on the BSP model
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Fully dynamic search trees for an extension of the BSP model
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Efficient execution of nondeterministic parallel programs on asynchronous systems
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Communication-efficient parallel sorting (preliminary version)
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Automatic methods for hiding latency in high bandwidth networks (extended abstract)
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
LogP: a practical model of parallel computation
Communications of the ACM
Fast Parallel Sorting Under LogP: Experience with the CM-5
IEEE Transactions on Parallel and Distributed Systems
The Block Distributed Memory Model
IEEE Transactions on Parallel and Distributed Systems
A quantitative comparison of parallel computation models
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Strategic directions in research in theory of computing
ACM Computing Surveys (CSUR) - Special ACM 50th-anniversary issue: strategic directions in computing research
A bridging model for parallel computation, communication, and I/O
ACM Computing Surveys (CSUR) - Special issue: position statements on strategic directions in computing research
IEEE Transactions on Parallel and Distributed Systems
Parallel computation still not ready for the mainstream
Communications of the ACM
Can shared-memory model serve as a bridging model for parallel computation?
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Efficient computations on fault-prone BSP machines
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Modeling parallel bandwidth: local vs. global restrictions
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Efficient external memory algorithms by simulating coarse-grained parallel algorithms
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
From algorithm parallelism to instruction-level parallelism: an encode-decode chain using prefix-sum
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Improved routing and sorting on multibutterflies
STOC '97 Proceedings of the twenty-ninth annual ACM symposium on Theory of computing
A methodology for specifying data distribution using only standard object-oriented features
ICS '97 Proceedings of the 11th international conference on Supercomputing
LoPC: modeling contention in parallel algorithms
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Accounting for Memory Bank Contention and Delay in High-Bandwidth Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Billiards and related systems on the bulk-synchronous parallel model
Proceedings of the eleventh workshop on Parallel and distributed simulation
Edge Congestion of Shortest Path Systems for All-to-All Communication
IEEE Transactions on Parallel and Distributed Systems
Efficient Algorithms for All-to-All Communications in Multiport Message-Passing Systems
IEEE Transactions on Parallel and Distributed Systems
Abstractions for Portable, Scalable Parallel Programming
IEEE Transactions on Parallel and Distributed Systems
Communication-optimal parallel minimum spanning tree algorithms (extended abstract)
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
“Dynamic-fault-prone BSP”: a paradigm for robust computations in changing environments
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Explicit multi-threading (XMT) bridging models for instruction parallelism (extended abstract)
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Computational bounds for fundamental problems on general-purpose parallel models
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Optimizing and load balancing metacomputing applications
ICS '98 Proceedings of the 12th international conference on Supercomputing
Load balanced parallel radix sort
ICS '98 Proceedings of the 12th international conference on Supercomputing
Support for Efficient Programming on the SB-PRAM
International Journal of Parallel Programming
Models and languages for parallel computation
ACM Computing Surveys (CSUR)
A quantitative comparison of parallel computation models
ACM Transactions on Computer Systems (TOCS)
A new deterministic parallel sorting algorithm with an experimental evaluation
Journal of Experimental Algorithmics (JEA)
A Gracefully Degrading Massively Parallel System Using the BSP Model, and Its Evaluation
IEEE Transactions on Computers
Computing the Medial Axis Transform in Parallel With Eight Scan Operations
IEEE Transactions on Pattern Analysis and Machine Intelligence
Resource Scaling Effects on MPP Performance: The STAP Benchmark Implications
IEEE Transactions on Parallel and Distributed Systems
Communication-processor tradeoffs in limited resources PRAM
Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
BOS is boss: a case for bulk-synchronous object systems
Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
Data management in networks: experimental evaluation of a provably good strategy
Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
The Journal of Supercomputing
Preemptive scheduling of parallel jobs on multiprocessors
Proceedings of the seventh annual ACM-SIAM symposium on Discrete algorithms
Randomized fully-scalable BSP techniques for multi-searching and convex hull construction
SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms
The QRQW PRAM: accounting for contention in parallel algorithms
SODA '94 Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms
Emulations between QSM, BSP, and LogP: a framework for general-purpose parallel algorithm design
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Portable and Efficient Parallel Computing Using the BSP Model
IEEE Transactions on Computers
Optimal Clustering of Tree-Sweep Computations for High-Latency Parallel Environments
IEEE Transactions on Parallel and Distributed Systems
Coarse grained parallel computing on heterogeneous systems
SAC '98 Proceedings of the 1998 ACM symposium on Applied Computing
Lower Bounds on Communication Loads and Optimal Placements in Torus Networks
IEEE Transactions on Computers
Proceedings of the 14th international conference on Supercomputing
Compression using efficient multicasting
STOC '00 Proceedings of the thirty-second annual ACM symposium on Theory of computing
PADS '00 Proceedings of the fourteenth workshop on Parallel and distributed simulation
A Design Methodology for Data-Parallel Applications
IEEE Transactions on Software Engineering - Special issue on architecture-independent languages and software tools parallel processing
A Transformation Approach to Derive Efficient Parallel Implementations
IEEE Transactions on Software Engineering - Special issue on architecture-independent languages and software tools parallel processing
Scatter and gather operations on an asynchronous communication model
SAC '00 Proceedings of the 2000 ACM symposium on Applied computing - Volume 2
A formal model for the parallel semantics of P3L
SAC '00 Proceedings of the 2000 ACM symposium on Applied computing - Volume 2
Some remarks on parallel exponentiation (extended abstract)
ISSAC '00 Proceedings of the 2000 international symposium on Symbolic and algebraic computation
An object-oriented framework for data parallelism
ACM Computing Surveys (CSUR)
Task Allocation on a Network of Processors
IEEE Transactions on Computers
An Optical Bus-Based Distributed Dynamic Barrier Mechanism
IEEE Transactions on Computers
NestStep: Nested Parallelism and Virtual Shared Memory for the BSP Model
The Journal of Supercomputing
H-BSP: A Hierarchical BSP Computation Model
The Journal of Supercomputing
The implementation of MPI-2 one-sided communication for the NEC SX-5
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Finding Hamiltonian paths in tournaments on clusters - a provably communication-efficient approach
Proceedings of the 2001 ACM symposium on Applied computing
Concurrent threads and optimal parallel minimum spanning trees algorithm
Journal of the ACM (JACM)
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Towards practical deterministic write-all algorithms
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Scalable isosurface visualization of massive datasets on COTS clusters
PVG '01 Proceedings of the IEEE 2001 symposium on parallel and large-data visualization and graphics
Factoring a binary polynomial of degree over one million
ACM SIGSAM Bulletin
Grid-enabled parallel divide-and-conquer: theory and practice
Proceedings of the 2002 ACM symposium on Applied computing
Communication overlap in multi-tier parallel algorithms
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Highly portable and efficient implementations of parallel adaptive N-body methods
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Topology preserving dynamic load balancing for parallel molecular simulations
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Minimizing randomness in minimum spanning tree, parallel connectivity, and set maxima algorithms
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Optimal tiling for the RNA base pairing problem
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
Parallel dynamic programming for solving the string editing problem on a CGM/BSP
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
IEEE Transactions on Parallel and Distributed Systems
Use of a CORBA/RMI gateway: characterization of communication overhead
WOSP '02 Proceedings of the 3rd international workshop on Software and performance
Predicting the performance of synchronous discrete event simulation systems
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
Architectural differences of efficient sequential and parallel computers
Journal of Systems Architecture: the EUROMICRO Journal
Adaptive parallel computing on heterogeneous networks with mpC
Parallel Computing
Performance-steered design of software architectures for embedded multicore systems
Software—Practice & Experience
A Glossary of Parallel Computing Terminology
IEEE Parallel & Distributed Technology: Systems & Technology
Nest: A Nested-Predicate Scheme for Fault Tolerance
IEEE Transactions on Computers
Program Structuring for Effective Parallel Portability
IEEE Transactions on Parallel and Distributed Systems
On Load Balancing for Distributed Multiagent Computing
IEEE Transactions on Parallel and Distributed Systems
DNA electrophoresis studied with the cage model
Journal of Computational Physics
Work-optimal simulation of PRAM models on meshes
Nordic Journal of Computing
Symbolic Performance Modeling of Parallel Systems
IEEE Transactions on Parallel and Distributed Systems
Frequency-adaptive join for shared nothing machines
Progress in computer research
A Design of Parallel R-tree on Cluster of Workstations
DNIS '00 Proceedings of the International Workshop on Databases in Networked Information Systems
Parallel Models and Job Characterization for System Scheduling
ICCS '01 Proceedings of the International Conference on Computational Science-Part II
A Coarse-Grained Parallel Algorithm for Maximal Cliques in Circle Graphs
ICCS '01 Proceedings of the International Conference on Computational Science-Part II
Parallel Bridging Models and Their Impact on Algorithm Design
ICCS '01 Proceedings of the International Conference on Computational Science-Part II
On the Effectiveness of D-BSP as a Bridging Model of Parallel Computation
ICCS '01 Proceedings of the International Conference on Computational Science-Part II
Parallel Algorithm Design with Coarse-Grained Synchronization
ICCS '01 Proceedings of the International Conference on Computational Science-Part II
Skel-BSP: Performance Portability for Skeletal Programming
HPCN Europe 2000 Proceedings of the 8th International Conference on High-Performance Computing and Networking
Algorithms for Generic Tools in Parallel Numerical Simulation
HPCN Europe 2000 Proceedings of the 8th International Conference on High-Performance Computing and Networking
Simulating Parallel Architectures with BSPlab
HPCN Europe 2001 Proceedings of the 9th International Conference on High-Performance Computing and Networking
A Randomized Algorithm for Voronoi Diagram of Line Segments on Coarse-Grained Multiprocessors
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Interoperability of Data Parallel Runtime Libraries
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Parallel 'Go with the Winners' Algorithms in the LogP Model
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
An Out-of-Core Sorting Algorithm for Clusters with Processors at Different Speed
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
A 2-D Parallel Convex Hull Algorithm with Optimal Communication Phases
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
A BSP Approach to the Scheduling of Tightly-Nested Loops
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Optimizing Parallel Bitonic Sort
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
The Power of SIMDs in Real-Time Scheduling
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Distribution Sweeping on Clustered Machines with Hierarchical Memories
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
d-Dimensional Range Search on Multicomputers
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
A Randomized Sorting Algorithm on the BSP model
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Algorithms for SMP-Clusters Dense Matrix-Vector Multiplication
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Buckets Strike Back: Improved Parallel Shortest Paths
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Exploiting Hierarchy in Heterogeneous Environments
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
The Heterogeneous Bulk Synchronous Parallel Model
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
Implementing Shared Memory on Clustered Machines
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
A Hardware Implementation of PRAM and Its Performance Evaluation
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
Cost Hierarchies for Abstract Parallel Machines
LCPC '00 Proceedings of the 13th International Workshop on Languages and Compilers for Parallel Computing-Revised Papers
κNUMA: A Model for Clusters of SMP-Machines
PPAM '01 Proceedings of the th International Conference on Parallel Processing and Applied Mathematics-Revised Papers
A Language for the Complexity Analysis of Parallel Programs
PPAM '01 Proceedings of the th International Conference on Parallel Processing and Applied Mathematics-Revised Papers
Transparent Parallelisation Through Reuse: Between a Compiler and a Library Approach
ECOOP '93 Proceedings of the 7th European Conference on Object-Oriented Programming
All-Pairs Shortest Paths Computation in the BSP Model
ICALP '01 Proceedings of the 28th International Colloquium on Automata, Languages and Programming
Seamless Integration of Parallelism and Memory Hierarchy
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Time-Sharing Parallel Jobs in the Presence of Multiple Resource Requirements
IPDPS '00/JSSPP '00 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
Improving Parallel Job Scheduling Using Runtime Measurements
IPDPS '00/JSSPP '00 Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing
Average-Case Communication-Optimal Parallel Parenthesis Matching
ISAAC '02 Proceedings of the 13th International Symposium on Algorithms and Computation
BSλp: Functional BSP Programs on Enumerated Vectors
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
The ParCel-2 Programming Language (Research Note)
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Performance Prediction of an NAS Benchmark Program with ChronosMix Environment
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
On the Predictive Quality of BSP-like Cost Functions for NOWs
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Performance Prediction of Oblivious BSP Programs
Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
Towards Formally Refining BSP Barriers into Explicit Two-Sided Communications
Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
Symbolic Cost Estimation of Parallel Applications
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Non-approximability of the Bulk Synchronous Task Scheduling Problem
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Parallel Convex Hull Computation by Generalised Regular Sampling
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Parallel Algorithms for Grounded Range Search and Applications
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Optimising Skeletal-Stream Parallelism on a BSP Computer
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Condensed Graphs: A Multi-level, Parallel, Intermediate Representation
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
The Skel-BSP Global Optimizer: Enhancing Performance Portability in Parallel Programming
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Parallel Geometric Algorithms in Coarse-Grain Network Models
COCOON '98 Proceedings of the 4th Annual International Conference on Computing and Combinatorics
Computational Power of BSP Computers
SOFSEM '98 Proceedings of the 25th Conference on Current Trends in Theory and Practice of Informatics: Theory and Practice of Informatics
Pipelined Decomposable BSP Computers
SOFSEM '01 Proceedings of the 28th Conference on Current Trends in Theory and Practice of Informatics Piestany: Theory and Practice of Informatics
VECPAR '00 Selected Papers and Invited Talks from the 4th International Conference on Vector and Parallel Processing
Measuring the Performance Impact of SP-Restricted Programming in Shared-Memory Machines
VECPAR '00 Selected Papers and Invited Talks from the 4th International Conference on Vector and Parallel Processing
BSP Algorithms - Write Once, Run Anywhere
WAE '99 Proceedings of the 3rd International Workshop on Algorithm Engineering
Portable List Ranking: An Experimental Study
WAE '00 Proceedings of the 4th International Workshop on Algorithm Engineering
Graph Coloring on a Coarse Grained Multiprocessor
WG '00 Proceedings of the 26th International Workshop on Graph-Theoretic Concepts in Computer Science
Coarse Grained Parallel Algorithms for Detecting Convex Bipartite Graphs
WG '00 Proceedings of the 26th International Workshop on Graph-Theoretic Concepts in Computer Science
Intensional High Performance Computing
DCW '00 Proceedings of the Third International Workshop on Distributed Communities on the Web
A Fixpoint Theory for Non-monotonic Parallelism
CSL '02 Proceedings of the 16th International Workshop and 11th Annual Conference of the EACSL on Computer Science Logic
How to Write a Healthiness Condition
IFM '00 Proceedings of the Second International Conference on Integrated Formal Methods
Performance and Predictability of MPI and BSP Programs on the CRAY T3E
Proceedings of the 6th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Handling Graphs According to a Coarse Grained Approach: Experiments with PVM and MPI
Proceedings of the 7th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Adaptive Execution of Pipelines
Proceedings of the 8th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
PVM Computation of the Transitive Closure: The Dependency Graph Approach
Proceedings of the 8th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Nested Bulk Synchronous Parallel Computing
Proceedings of the 6th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
A pi-calculus Model of a Spanish Fish Market - Preliminary Report
AMET '98 Selected Papers from the First International Workshop on Agent Mediated Electronic Trading on Agent Mediated Electronic Commerce
A Skew-insensitive Algorithm for Join and Multi-join Operations on Shared Nothing Machines
DEXA '00 Proceedings of the 11th International Conference on Database and Expert Systems Applications
Declarative Modelling of Constraint Propagation Strategies
ADVIS '00 Proceedings of the First International Conference on Advances in Information Systems
Decomposable Bulk Synchronous Parallel Computers
SOFSEM '99 Proceedings of the 26th Conference on Current Trends in Theory and Practice of Informatics
Architecture Independent Analysis of Parallel Programs
ICCS '01 Proceedings of the International Conference on Computational Science-Part II
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
BSP Performance Analysis and Prediction: Tools and Application
PaCT '99 Proceedings of the 5th International Conference on Parallel Computing Technologies
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
A Cost Model for Asynchronous and Structured Message Passing
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
The Paderborn University BSP (PUB) library
Parallel Computing
List-ranking on interconnection networks
Information and Computation
Parallel ray tracing on a chip
Practical parallel rendering
Portable and architecture independent parallel performance tuning using BSP
Parallel Computing
Randomized permutations in a coarse grained parallel environment
Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures
1-Optimality of static BSP computations: scheduling independent chains as a case study
Theoretical Computer Science
SCL-chan: An Asynchronous Data-Parallel Language for Irregular Algorithms
HIPS '97 Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
Algorithm Design and Analysis Using the WPRAM Model
HIPS '97 Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
Abstracting network characteristics and locality properties of parallel systems
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Operator Design Pattern for Data Parallel Computation
TOOLS '97 Proceedings of the Tools-23: Technology of Object-Oriented Languages and Systems
Architecture independent parallel selection with applications to parallel priority queues
Theoretical Computer Science
Algorithm engineering for parallel computation
Experimental algorithmics
Gracefully Degrading Systems Using the Bulk-Synchronous Parallel Model with Randomised Shared Memory
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
Deterministic Routing of h-relations on the Multibutterfly
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Experimental Validation of Parallel Computation Models on the Intel Paragon
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Predicting the Running Times of Parallel Programs by Simulation
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Efficient Barrier Synchronization Mechanism for the BSP Model on Message-Passing Architectures
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Contention-Aware Communication Schedule for High-Speed Communication
Cluster Computing
Portable list ranking: an experimental study
Journal of Experimental Algorithmics (JEA)
Graph coloring on coarse grained multicomputers
Discrete Applied Mathematics - Special issue: The second international colloquium, "journées de l'informatique messine"
Parallel 'go with the winners' algorithms in distributed memory models
Journal of Parallel and Distributed Computing - Special section best papers from the 2002 international parallel and distributed processing symposium
Incorporating memory layout in the modeling of message passing programs
Journal of Systems Architecture: the EUROMICRO Journal - Special issue: Parallel, distributed and network-based processing
Parallel implementation of geometric shortest path algorithms
Parallel Computing - Special issue: High performance computing with geographical data
Deterministic computations on a PRAM with static processor and memory faults
Fundamenta Informaticae
Optimal broadcast on parallel locality models
Journal of Discrete Algorithms
A fixpoint theory for non-monotonic parallelism
Theoretical Computer Science
Emulations between QSM, BSP and LogP: a framework for general-purpose parallel algorithm design
Journal of Parallel and Distributed Computing
Solving large FPT problems on coarse-grained parallel machines
Journal of Computer and System Sciences - Special issue on Parameterized computation and complexity
Parallel and Distributed Haskells
Journal of Functional Programming
ACM Transactions on Programming Languages and Systems (TOPLAS)
Architecture independent parallel binomial tree option price valuations
Parallel Computing
Parallelism versus memory allocation in pipelined router forwarding engines
Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Predicting the performance of parallel programs
Parallel Computing
Checkpointing-based rollback recovery for parallel applications on the InteGrade grid middleware
MGC '04 Proceedings of the 2nd workshop on Middleware for grid computing
Parallel and distributed simulation: managing external workload with BSP time warp
Proceedings of the 34th conference on Winter simulation: exploring new frontiers
Predicting the Performance of Synchronous Discrete Event Simulation
IEEE Transactions on Parallel and Distributed Systems
A Geometric Programming Framework for Optimal Multi-Level Tiling
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Predicting and Evaluating Distributed Communication Performance
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
BCS-MPI: A New Approach in the System Software Design for Large-Scale Parallel Computers
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Coarse grained gather and scatter operations with applications
Journal of Parallel and Distributed Computing
On performance analysis of heterogeneous parallel algorithms
Parallel Computing
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 14 - Volume 15
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 14 - Volume 15
Productivity in High Performance Computing
International Journal of High Performance Computing Applications
Journal of Parallel and Distributed Computing
A tight analysis and near-optimal instances of the algorithm of Anderson and Woll
Theoretical Computer Science
A Coarse-Grained multicomputer algorithm for the detection of repetitions
Information Processing Letters
Loci: a rule-based framework for parallel multi-disciplinary simulation synthesis
Journal of Functional Programming
Encyclopedia of Computer Science
The design and implementation of LilyTask in shared memory
ACM SIGOPS Operating Systems Review
On deadlocks of exclusive AND-requests for resources
Distributed Computing
Cluster Computing for Determining Three-Dimensional Protein Structure
The Journal of Supercomputing
A static analysis for bulk synchronous parallel ML to avoid parallel nesting
Future Generation Computer Systems - Special issue: Parallel computing technologies
Adaptive Parallel Job Scheduling with Flexible Coscheduling
IEEE Transactions on Parallel and Distributed Systems
Lifting sequential graph algorithms for distributed-memory parallel computation
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Application Resource Requirement Estimation in a Parallel-Pipeline Model of Execution
IEEE Transactions on Parallel and Distributed Systems
The implementation of the BSP parallel computing model on the InteGrade Grid middleware
MGC '05 Proceedings of the 3rd international workshop on Middleware for grid computing
The MHETA Execution Model for Heterogeneous Clusters
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Towards a more realistic BSP cost model
HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
Self-Organizing Communication-aware Resource Management for Scheduling in Grid Environment
HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
International Journal of Parallel Programming
Exchanging messages of different sizes
Journal of Parallel and Distributed Computing
Communication-optimal parallel parenthesis matching
Parallel Computing
Translating submachine locality into locality of reference
Journal of Parallel and Distributed Computing - Special issue: 18th International parallel and distributed processing symposium
PEMPIs: a new methodology of modeling and prediction of MPI programs performance
International Journal of Parallel Programming
Efficient automatic simulation of parallel computation on networks of workstations
Discrete Applied Mathematics
Modeling instruction placement on a spatial architecture
Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures
The cache complexity of multithreaded cache oblivious algorithms
Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures
Designing irregular parallel algorithms with mutual exclusion and lock-free protocols
Journal of Parallel and Distributed Computing
ICPADS '06 Proceedings of the 12th International Conference on Parallel and Distributed Systems - Volume 2
Realizing the e-science desktop peer using a peer-to-peer distributed virtual machine middleware
Proceedings of the 4th international workshop on Middleware for grid computing
A Parallel Computational Model for Heterogeneous Clusters
IEEE Transactions on Parallel and Distributed Systems
WOSP '07 Proceedings of the 6th international workshop on Software and performance
Linear work suffix array construction
Journal of the ACM (JACM)
Stochastic modeling and analysis of hybrid mobility in reconfigurable distributed virtual machines
Journal of Parallel and Distributed Computing
Parallel Computing - Algorithmic skeletons
Computer Languages, Systems and Structures
A bulk-synchronous parallel process algebra
Computer Languages, Systems and Structures
Communication-efficient parallel generic pairwise elimination
Future Generation Computer Systems - Special section: Information engineering and enterprise architecture in distributed computing environments
Coprocessor design to support MPI primitives in configurable multiprocessors
Integration, the VLSI Journal
Performance engineering, PSEs and the GRID
Scientific Programming
A tool for performance modeling of parallel programs
Scientific Programming
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th Symposium on Operating Systems Design & Implementation - Volume 6
Modeling contention of giga-updates per second (GUPs) in three parallel programming paradigms
PDCN'07 Proceedings of the 25th IASTED International Multi-Conference: parallel and distributed computing and networks
Parallel Scripting with Python
Computing in Science and Engineering
Load balancing distributed inverted files
Proceedings of the 9th annual ACM international workshop on Web information and data management
High-performance distributed inverted files
Proceedings of the sixteenth ACM conference on information and knowledge management
Relation-based computations in a monadic BSP model
Parallel Computing
PRO: a model for the design and analysis of efficient and scalable parallel algorithms
Nordic Journal of Computing
Coarse grained parallel algorithms for graph matching
Parallel Computing
Optimistic parallelism benefits from data partitioning
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Efficient sampling of random permutations
Journal of Discrete Algorithms
A framework for scalable greedy coloring on distributed-memory parallel computers
Journal of Parallel and Distributed Computing
HPM: a hierarchical model for parallel computations
International Journal of High Performance Computing and Networking
Distributed computing and the multicore revolution
ACM SIGACT News
BSGP: bulk-synchronous GPU programming
ACM SIGGRAPH 2008 papers
A regression-based approach to scalability prediction
Proceedings of the 22nd annual international conference on Supercomputing
Optimal speedup on a low-degree multi-core parallel architecture (LoPRAM)
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Fundamental parallel algorithms for private-cache chip multiprocessors
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Cache-efficient dynamic programming algorithms for multicores
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
A framework for adaptive collective communications for heterogeneous hierarchical computing systems
Journal of Computer and System Sciences
Decomposing Verification Around End-User Features
Verified Software: Theories, Tools, Experiments
Searching and Updating Metric Space Databases Using the Parallel EGNAT
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
Efficient Parallel Tree Reductions on Distributed Memory Environments
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part II
Resource Load Balancing Based on Multi-agent in ServiceBSP Model
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part III: ICCS 2007
Exploiting Hybrid Parallelism in Web Search Engines
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
Scheduling Intersection Queries in Term Partitioned Inverted Files
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
A PGAS-Based Algorithm for the Longest Common Subsequence Problem
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
A Search Engine Index for Multimedia Content
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
A Bridging Model for Multi-core Computing
ESA '08 Proceedings of the 16th annual European symposium on Algorithms
Parallel methods for absolute irreducibility testing
The Journal of Supercomputing
High-performance priority queues for parallel crawlers
Proceedings of the 10th ACM workshop on Web information and data management
RAT: RC Amenability Test for Rapid Performance Prediction
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
A unified model for multicore architectures
IFMT '08 Proceedings of the 1st international forum on Next-generation multicore/manycore technologies
Scalable isosurface visualization of massive datasets on commodity off-the-shelf clusters
Journal of Parallel and Distributed Computing
Parallel query processing on distributed clustering indexes
Journal of Discrete Algorithms
Mapping parallelism to multi-cores: a machine learning based approach
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Boolean circuit programming: A new paradigm to design parallel algorithms
Journal of Discrete Algorithms
Compile-Time and Run-Time Issues in an Auto-Parallelisation System for the Cell BE Processor
Euro-Par 2008 Workshops - Parallel Processing
Proceedings of the forty-first annual ACM symposium on Theory of computing
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Rigel: an architecture and scalable programming interface for a 1000-core accelerator
Proceedings of the 36th annual international symposium on Computer architecture
The Journal of Supercomputing
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
psort, Yet Another Fast Stable Sorting Software
SEA '09 Proceedings of the 8th International Symposium on Experimental Algorithms
BSP2OMP: A Compiler for Translating BSP Programs to OpenMP
International Journal of Parallel, Emergent and Distributed Systems - Advances in Parallel and Distributed Computational Models
Speeding up genetic programming: a parallel BSP implementation
GECCO '96 Proceedings of the 1st annual conference on Genetic and evolutionary computation
Configurable emulated shared memory architecture for general purpose MP-SOCs and NOC regions
NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
OSL: Optimized Bulk Synchronous Parallel Skeletons on Distributed Arrays
APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
Evaluating multicore algorithms on the unified memory model
Scientific Programming - Software Development for Multi-core Computing Systems
Mobile processes, mobile channels and complex dynamic systems
CEC'09 Proceedings of the Eleventh Congress on Evolutionary Computation
Scalable communication protocols for dynamic sparse data exchange
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Graph coloring on coarse grained multicomputers
Discrete Applied Mathematics
A Generic Cost Model for Concurrent and Data-parallel Meta-computing
Electronic Notes in Theoretical Computer Science (ENTCS)
Santa Claus: Formal analysis of a process-oriented solution
ACM Transactions on Programming Languages and Systems (TOPLAS)
Cortical architectures on a GPGPU
Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units
Algorithms for memory hierarchies: advanced lectures
Algorithms for memory hierarchies: advanced lectures
A model for estimating the performance of synchronous parallel network simulation
International Journal of Modelling and Simulation
Application execution management on the InteGrade opportunistic grid middleware
Journal of Parallel and Distributed Computing
Simple linear work suffix array construction
ICALP'03 Proceedings of the 30th international conference on Automata, languages and programming
Anahy: a programming environment for cluster computing
VECPAR'06 Proceedings of the 7th international conference on High performance computing for computational science
ICCSA'03 Proceedings of the 2003 international conference on Computational science and its applications: Part II
A parallel wavefront algorithm for efficient biological sequence comparison
ICCSA'03 Proceedings of the 2003 international conference on Computational science and its applications: Part II
Towards realistic implementations of external memory algorithms using a coarse grained paradigm
ICCSA'03 Proceedings of the 2003 international conference on Computational science and its applications: Part II
Parallel superposition for bulk synchronous parallel ML
ICCS'03 Proceedings of the 2003 international conference on Computational science: Part III
A parallel virtual machine for bulk synchronous parallel ML
ICCS'03 Proceedings of the 1st international conference on Computational science: Part I
A parallel programming environment on grid
ICCS'03 Proceedings of the 1st international conference on Computational science: Part I
Mapping unstructured applications into nested parallelism
VECPAR'02 Proceedings of the 5th international conference on High performance computing for computational science
Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
NestStepModelica: mathematical modeling and bulk-synchronous parallel simulation
PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
The parXXL environment: scalable fine grained development for large coarse grained platforms
PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Experiments with a parallel external memory system
HiPC'07 Proceedings of the 14th international conference on High performance computing
A parallel BSP algorithm for irregular dynamic programming
APPT'07 Proceedings of the 7th international conference on Advanced parallel processing technologies
Modeling multigrain parallelism on heterogeneous multi-core processors: a case study of the Cell BE
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Divide-and-conquer parallel programming with minimally synchronous parallel ML
PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics
Pregel: a system for large-scale graph processing
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Low depth cache-oblivious algorithms
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
New algorithms for efficient parallel string comparison
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Cohesion: a hybrid memory model for accelerators
Proceedings of the 37th annual international symposium on Computer architecture
Efficient partial-duplicate detection based on sequence matching
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Methodology for Efficient Execution of SPMD Applications on Multicore Environments
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
mPlogP: A Parallel Computation Model for Heterogeneous Multi-core Computer
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
New caching techniques for web search engines
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Parallel processing of data from very large-scale wireless sensor networks
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Estimating parallel performance, a skeleton-based approach
Proceedings of the fourth international workshop on High-level parallel programming and applications
Parallel greedy graph matching using an edge partitioning approach
Proceedings of the fourth international workshop on High-level parallel programming and applications
Hybrid bulk synchronous parallelism library for clustered SMP architectures
Proceedings of the fourth international workshop on High-level parallel programming and applications
Building efficient multi-threaded search nodes
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
A model of computation for MapReduce
SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
Revisiting Cramer's rule for solving dense linear systems
SpringSim '10 Proceedings of the 2010 Spring Simulation Multiconference
Algorithms and theory of computation handbook
Parallel computation: models and complexity issues
Algorithms and theory of computation handbook
Parallel longest increasing subsequences in scalable time and memory
PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part I
Fast PGAS Implementation of Distributed Graph Algorithms
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Parallel selection by regular sampling
Euro-Par'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part II
A bridging model for multi-core computing
Journal of Computer and System Sciences
A middleware for parallel processing of large graphs
Proceedings of the 8th International Workshop on Middleware for Grids, Clouds and e-Science
Energy considerations for divisible load processing
PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part II
Model oriented profiling of parallel programs
EUROMICRO-PDP'02 Proceedings of the 10th Euromicro conference on Parallel, distributed and network-based processing
Predictability of bulk synchronous programs using MPI
EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Specification for reactive bulk-synchronous programming
EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Groups in bulk synchronous parallel computing
EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
The parallel cellular programming model
EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Towards efficient BSP implementations of BSR programs for some computational geometry problems
EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
Cache-oblivious simulation of parallel programs
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Towards a parallel framework of grid-based numerical algorithms on DAGs
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
LogfP - a model for small messages in InfiniBand
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Cost evaluation from specifications for BSP programs
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Support for adaptivity in ARMCI using migratable objects
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Algorithm engineering: bridging the gap between algorithm theory and practice
Algorithm engineering: bridging the gap between algorithm theory and practice
Piccolo: building fast, distributed programs with partitioned tables
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
3D block-based medial axis transform and chessboard distance transform based on dominance
Image and Vision Computing
Throughput-Effective On-Chip Networks for Manycore Accelerators
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Light-weight communications on Intel's single-chip cloud computer processor
ACM SIGOPS Operating Systems Review
A high-level framework for distributed processing of large-scale graphs
ICDCN'11 Proceedings of the 12th international conference on Distributed computing and networking
Kanor: a declarative language for explicit communication
PADL'11 Proceedings of the 13th international conference on Practical aspects of declarative languages
Applying process migration on a BSP-based LU decomposition application
VECPAR'10 Proceedings of the 9th international conference on High performance computing for computational science
A framework for parallel genetic algorithms on PC cluster
IMCAS'06 Proceedings of the 5th WSEAS international conference on Instrumentation, measurement, circuits and systems
Parallel evaluation of conjunctive queries
Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Brief announcement: large-scale multimaps
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Scheduling irregular parallel computations on hierarchical caches
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
The round complexity of distributed sorting: extended abstract
Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing
Active pebbles: parallel programming for data-driven applications
Proceedings of the international conference on Supercomputing
An analytical model for multilevel performance prediction of Multi-FPGA systems
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Balance principles for algorithm-architecture co-design
HotPar'11 Proceedings of the 3rd USENIX conference on Hot topic in parallelism
HipG: parallel processing of large-scale graphs
ACM SIGOPS Operating Systems Review
Spatial hardware implementation for sparse graph algorithms in GraphStep
ACM Transactions on Autonomous and Adaptive Systems (TAAS)
Performance modeling for multilevel communication in SHMEM+
Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model
Petri-nets as an intermediate representation for heterogeneous architectures
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part II
Kernel-based offload of collective operations: implementation, evaluation and lessons learned
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part II
Cache size in a cost model for heterogeneous skeletons
Proceedings of the fifth international workshop on High-level parallel programming and applications
Type system for a safe execution of parallel programs in BSML
Proceedings of the fifth international workshop on High-level parallel programming and applications
Proceedings of the 2nd ACM Symposium on Cloud Computing
Making time-stepped applications tick in the cloud
Proceedings of the 2nd ACM Symposium on Cloud Computing
A formal programming model of Orléans skeleton library
PaCT'11 Proceedings of the 11th international conference on Parallel computing technologies
Oracle scheduling: controlling granularity in implicitly parallel languages
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
A framework for an automatic hybrid MPI+OpenMP code generation
Proceedings of the 19th High Performance Computing Symposia
The Combinatorial BLAS: design, implementation, and applications
International Journal of High Performance Computing Applications
ParallelGDB: a parallel graph database based on cache specialization
Proceedings of the 15th Symposium on International Database Engineering & Applications
Optimizing explicit data transfers for data parallel applications on the cell architecture
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Paging for multi-core shared caches
Proceedings of the 3rd Innovations in Theoretical Computer Science Conference
Bounded arboricity to determine the local structure of sparse graphs
WG'06 Proceedings of the 32nd international conference on Graph-Theoretic Concepts in Computer Science
Bulk synchronous parallel ML: semantics and implementation of the parallel juxtaposition
CSR'06 Proceedings of the First international computer science conference on Theory and Applications
Total exchange performance modelling under network contention
PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
A web computing environment for parallel algorithms in Java
PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
Load balancing strategies in a web computing environment
PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
Multi-DaC programming model: a variant of multi-BSP model for divide-and-conquer algorithms
DAMP '12 Proceedings of the 7th workshop on Declarative aspects and applications of multicore programming
GCC'05 Proceedings of the 4th international conference on Grid and Cooperative Computing
ServiceBSP model with QoS considerations in grids
APWeb'06 Proceedings of the 2006 international conference on Advanced Web and Network Technologies, and Applications
An index data structure for searching in metric space databases
ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part I
A CGM algorithm solving the longest increasing subsequence problem
ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part V
Efficient longest common subsequence computation using bulk-synchronous parallelism
ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part V
ISPA'04 Proceedings of the Second international conference on Parallel and Distributed Processing and Applications
A BSP/CGM algorithm for finding all maximal contiguous subsequences of a sequence of numbers
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
A preliminary nested-parallel framework to efficiently implement scientific applications
VECPAR'04 Proceedings of the 6th international conference on High Performance Computing for Computational Science
SPC-XML: a structured representation for nested-parallel programming languages
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Towards a bulk-synchronous distributed shared memory programming environment for grids
PARA'04 Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing
JSSPP'05 Proceedings of the 11th international conference on Job Scheduling Strategies for Parallel Processing
ISAAC'04 Proceedings of the 15th international conference on Algorithms and Computation
Green-Marl: a DSL for easy and efficient graph analysis
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Fast concurrency control for distributed inverted files
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part I
Efficient parallelization of spatial approximation trees
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part I
Dynamic memory management in the loci framework
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part II
Bulk synchronous parallel ML: modular implementation and performance prediction
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part II
Modeling execution time of selected computation and communication kernels on grids
EGC'05 Proceedings of the 2005 European conference on Advances in Grid Computing
SIMD re-convergence at thread frontiers
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Generating c code from LOGS specifications
ICTAC'05 Proceedings of the Second international conference on Theoretical Aspects of Computing
Canal: scaling social network-based Sybil tolerance schemes
Proceedings of the 7th ACM european conference on Computer Systems
SGL: towards a bridging model for heterogeneous hierarchical platforms
International Journal of High Performance Computing and Networking
Foundations and Trends® in Machine Learning
Continuous monitoring in the dynamic sensor field model
ALGOSENSORS'11 Proceedings of the 7th international conference on Algorithms for Sensor Systems, Wireless Ad Hoc Networks and Autonomous Mobile Entities
Palovca: describing and executing graph algorithms in haskell
PADL'12 Proceedings of the 14th international conference on Practical Aspects of Declarative Languages
Sorting, searching, and simulation in the mapreduce framework
ISAAC'11 Proceedings of the 22nd international conference on Algorithms and Computation
The efficiency of mapreduce in parallel external memory
LATIN'12 Proceedings of the 10th Latin American international conference on Theoretical Informatics
Diderot: a parallel DSL for image analysis and visualization
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
An object-oriented bulk synchronous parallel library for multicore programming
Concurrency and Computation: Practice & Experience
Survey: Computational models for networks of tiny artifacts: A survey
Computer Science Review
Space-round tradeoffs for MapReduce computations
Proceedings of the 26th ACM international conference on Supercomputing
Load Balancing Query Processing in Metric-Space Similarity Search
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Communication-optimal parallel algorithm for strassen's matrix multiplication
Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures
A coarse-grained parallel algorithm for the matrix chain order problem
Proceedings of the 2012 Symposium on High Performance Computing
BC-PDM: data mining, social network analysis and text mining system based on cloud computing
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Managing large graphs on multi-cores with graph awareness
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
A novel parallel algorithm for gaussian elimination of sparse unsymmetric matrices
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I
An experimental comparison of load balancing strategies in a web computing environment
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part II
Verification of a heat diffusion simulation written with orléans skeleton library
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part II
Counter automata for parameterised timing analysis of box-based systems
FOPARA'11 Proceedings of the Second international conference on Foundational and Practical Aspects of Resource Analysis
A yoke of oxen and a thousand chickens for heavy lifting graph processing
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Deterministic Computations on a PRAM with Static Processor and Memory Faults
Fundamenta Informaticae
A black-box approach to understanding concurrency in DaCapo
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
To boldly go: an occam-π mission to engineer emergence
Natural Computing: an international journal
Through the concurrency gateway: a challenge from the near future of graphics hardware
EG PGV'04 Proceedings of the 5th Eurographics conference on Parallel Graphics and Visualization
GraphChi: large-scale graph computation on just a PC
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Breaking the speed and scalability barriers for graph exploration on distributed-memory machines
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Bamboo: translating MPI applications to a latency-tolerant, data-driven form
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Aspen: a domain specific language for performance modeling
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Optimization principles for collective neighborhood communications
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Facilitating real-time graph mining
Proceedings of the fourth international workshop on Cloud data management
A scheduling toolkit for multiprocessor-task programming with dependencies
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Load balancing on an interactive multiplayer game server
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
A search engine accepting on-line updates
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Techniques for designing efficient parallel graph algorithms for SMPs and multicore processors
ISPA'07 Proceedings of the 5th international conference on Parallel and Distributed Processing and Applications
Self-consistent MPI performance requirements
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
(Sync|Async)+ MPI search engines
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Continuous monitoring in the dynamic sensor field model
Theoretical Computer Science
Concurrency and Computation: Practice & Experience
Towards a complexity model for design and analysis of PGAS-based algorithms
HPCC'07 Proceedings of the Third international conference on High Performance Computing and Communications
Job scheduling using successive linear programming approximations of a sparse model
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
A lower bound technique for communication on BSP with application to the FFT
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
A verified library of algorithmic skeletons on evenly distributed arrays
ICA3PP'12 Proceedings of the 12th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
Using Pregel-like Large Scale Graph Processing Frameworks for Social Network Analysis
ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)
Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Iterative parallel data processing with stratosphere: an inside look
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Cumulon: optimizing statistical data analysis in the cloud
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Mizan: a system for dynamic load balancing in large-scale graph processing
Proceedings of the 8th ACM European Conference on Computer Systems
Presto: distributed machine learning and graph processing with sparse matrices
Proceedings of the 8th ACM European Conference on Computer Systems
Bandwidth-optimal all-to-all exchanges in fat tree networks
Proceedings of the 27th international ACM conference on International conference on supercomputing
Expressing graph algorithms using generalized active messages
Proceedings of the 27th international ACM conference on International conference on supercomputing
Proceedings of the 16th International ACM Sigsoft symposium on Component-based software engineering
A divide and conquer approach and a work-optimal parallel algorithm for the LIS problem
Information Processing Letters
Early experiences in using a domain-specific language for large-scale graph analysis
First International Workshop on Graph Data Management Experiences and Systems
GPS: a graph processing system
Proceedings of the 25th International Conference on Scientific and Statistical Database Management
Approximate parallel simulation of web search engines
Proceedings of the 2013 ACM SIGSIM conference on Principles of advanced discrete simulation
Fast greedy algorithms in mapreduce and streaming
Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures
G-path: flexible path pattern query on large graphs
Proceedings of the 22nd international conference on World Wide Web companion
WTF: the who to follow service at Twitter
Proceedings of the 22nd international conference on World Wide Web
Estimating parallel performance
Journal of Parallel and Distributed Computing
The von Neumann architecture is due for retirement
HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
Distributed data management using MapReduce
ACM Computing Surveys (CSUR)
An improved parallel singular value algorithm and its implementation for multicore hardware
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
PAGE: a partition aware graph computation engine
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
"All roads lead to Rome": optimistic recovery for distributed iterative data processing
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Modeling synthetic aperture radar computation with Aspen
International Journal of High Performance Computing Applications
Designing on-chip networks for throughput accelerators
ACM Transactions on Architecture and Code Optimization (TACO)
Journal of Systems and Software
Analysis of partitioning strategies for graph processing in bulk synchronous parallel models
Proceedings of the fifth international workshop on Cloud data management
Proceedings of the 21st International conference on Real-Time Networks and Systems
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
A lightweight infrastructure for graph analytics
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
The family of mapreduce and large-scale data processing systems
ACM Computing Surveys (CSUR)
DANBI: dynamic scheduling of irregular stream programs for many-core systems
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Extending modern PaaS clouds with BSP to execute legacy MPI applications
Proceedings of the 4th annual Symposium on Cloud Computing
Programming with BSP homomorphisms
Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
Giraphx: parallel yet serializable large-scale graph processing
Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
BDMPI: conquering BigData with small clusters using MPI
DISCS-2013 Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems
The energy case for graph processing on hybrid CPU and GPU systems
IA^3 '13 Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms
Making queries tractable on big data with preprocessing: through the eyes of complexity theory
Proceedings of the VLDB Endowment
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Efficient query evaluation on distributed graphs with Hadoop environment
Proceedings of the Fourth Symposium on Information and Communication Technology
Simplifying Scalable Graph Processing with a Domain-Specific Language
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
Red Fox: An Execution Environment for Relational Query Processing on GPUs
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
PREDIcT: towards predicting the runtime of large scale iterative analytics
Proceedings of the VLDB Endowment
Distributed socialite: a datalog-based language for large-scale graph analysis
Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment
Fast iterative graph computation with block updates
Proceedings of the VLDB Endowment
Compiling Fresh Breeze Codelets
Proceedings of Programming Models and Applications on Multicores and Manycores
Efficient Parallel Implementations of Multiple Sequence Alignment using BSP/CGM Model
Proceedings of Programming Models and Applications on Multicores and Manycores
A memory access model for highly-threaded many-core architectures
Future Generation Computer Systems
Apple-CORE: Harnessing general-purpose many-cores with hardware concurrency management
Microprocessors & Microsystems
Parallel processing of large graphs
Future Generation Computer Systems
Minimizing synchronizations in sparse iterative solvers for distributed supercomputers
Computers & Mathematics with Applications
Measurement of the latency parameters of the Multi-BSP model: a multicore benchmarking approach
The Journal of Supercomputing
Exploiting inter-operation parallelism for matrix chain multiplication using MapReduce
The Journal of Supercomputing
Is multicore hardware for general-purpose parallel processing broken?
Communications of the ACM
NEWT - A Fault Tolerant BSP Framework on Hadoop YARN
UCC '13 Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing
A Randomized Parallel Three-Dimensional Convex Hull Algorithm for Coarse-Grained Multicomputers
Theory of Computing Systems
Modelling Search Engines Performance Using Coloured Petri Nets
Fundamenta Informaticae - Application and Theory of Petri Nets and Concurrency, 2012
The success of the von Neumann model of sequential computation is attributable to the fact that it is an efficient bridge between software and hardware: high-level languages can be efficiently compiled onto this model, yet it can be efficiently implemented in hardware. The author argues that an analogous bridge between software and hardware is required for parallel computation if that is to become as widely used. This article introduces the bulk-synchronous parallel (BSP) model as a candidate for this role, and gives results quantifying its efficiency both in implementing high-level language features and algorithms and in being implemented in hardware.