Scalable data parallel implementations of object recognition using geometric hashing
Journal of Parallel and Distributed Computing - Special issue on data parallel algorithms and programming
Processor Mapping Techniques Toward Efficient Data Redistribution
IEEE Transactions on Parallel and Distributed Systems
A Generalized Processor Mapping Technique for Array Redistribution
IEEE Transactions on Parallel and Distributed Systems
Introduction to Algorithms
Broadcast scheduling optimization for heterogeneous cluster systems
Journal of Algorithms
LLM: A Low Latency Messaging Infrastructure for Linux Clusters
HiPC '02 Proceedings of the 9th International Conference on High Performance Computing
Exploiting Communication Latency Hiding for Parallel Network Computing
Proceedings of the 1994 International Conference on Parallel and Distributed Systems
Many-to-many personalized communication with bounded traffic
FRONTIERS '95 Proceedings of the Fifth Symposium on the Frontiers of Massively Parallel Computation (Frontiers'95)
Efficient collective communication in distributed heterogeneous systems
Journal of Parallel and Distributed Computing
Strategies for Achieving High Performance Incremental Computing on a Network Environment
AINA '04 Proceedings of the 18th International Conference on Advanced Information Networking and Applications - Volume 2
The Anatomy of the Grid: Enabling Scalable Virtual Organizations
International Journal of High Performance Computing Applications
Bounds on the Client-Server Incremental Computing
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Irregular redistribution scheduling by partitioning messages
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Incremental computing masks communication latency by overlapping computation with communication. However, a sequence of messages with large latency variance still causes computation to proceed intermittently. It is known that a dominant input stream from a data server maximizes the CPU utilization of a networked computation server [7]. Unfortunately, the problem of finding a dominant input stream is NP-hard in the strong sense. In this paper, a dominant input stream for LU decomposition is proposed, and it is shown to outperform an input stream that sends data in the traditional order. In addition, the nonexistence of dominant input streams is proved for the case in which a compressed format is used to send the input data.
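The overlap idea behind incremental computing can be illustrated with a small sketch: a "data server" thread streams matrix rows through a bounded queue while a "computation server" eliminates each row as soon as it arrives, so the wait for the next message overlaps useful work. This is only a minimal illustration under simplifying assumptions (no pivoting, rows arriving in the traditional top-to-bottom order, thread/queue names invented here), not the dominant-input-stream ordering studied in the paper.

```python
import threading
import queue

def stream_rows(a, out_q):
    """Data server: push matrix rows one message at a time."""
    for row in a:
        out_q.put(list(row))   # blocks when the small buffer is full
    out_q.put(None)            # end-of-stream marker

def incremental_lu(in_q, n):
    """Computation server: reduce each row as it arrives (no pivoting)."""
    U = []   # pivot rows received and fully reduced so far
    L = []   # unit-lower-triangular multipliers, one row per input row
    while True:
        row = in_q.get()
        if row is None:
            break
        l_row = [0.0] * n
        # Eliminate the new row against every pivot row seen so far;
        # this work overlaps the latency of the next message.
        for k, u in enumerate(U):
            m = row[k] / u[k]
            l_row[k] = m
            for j in range(k, n):
                row[j] -= m * u[j]
        l_row[len(U)] = 1.0
        U.append(row)
        L.append(l_row)
    return L, U

A = [[4.0, 3.0, 2.0],
     [8.0, 7.0, 5.0],
     [4.0, 8.0, 9.0]]
q = queue.Queue(maxsize=1)   # tiny buffer: arrival order drives progress
producer = threading.Thread(target=stream_rows, args=(A, q))
producer.start()
L, U = incremental_lu(q, 3)
producer.join()
```

With a well-chosen (dominant) arrival order, each row's elimination work would keep the CPU busy through the gaps between messages; with a poorly ordered stream, the consumer idles at `in_q.get()`, which is exactly the intermittent progress the abstract describes.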