Fast multiplication of large permutations for disk, flash memory and RAM
Proceedings of the 2010 International Symposium on Symbolic and Algebraic Computation
The traditional permutation multiplication algorithm is now limited by memory latency and not by CPU speed. A new cache-aware permutation algorithm speeds up permutation multiplication by a factor of 3.4 on current CPUs. The new algorithm is limited by memory bandwidth, but not by memory latency. Current trends indicate improving memory bandwidth and stagnant memory latency. This makes the new algorithm especially important for future computer architectures. In addition, we believe this "memory wall" will soon force a redesign of other common algorithms of symbolic algebra.
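
The abstract does not spell out the new algorithm, so the following is only a minimal sketch of the contrast it draws, not the authors' method. `multiply_naive` is the traditional loop, where each lookup `y[x[i]]` lands at an unpredictable address and, for large permutations, tends to pay a full memory-latency penalty. `multiply_bucketed` is a generic cache-aware variant, assumed here for illustration: it first streams through `x` and groups the lookups by the block of `y` they touch, then serves one block at a time. The function names, the `block` parameter, and the bucketing scheme are all hypothetical.

```python
def multiply_naive(x, y):
    """Traditional product: z[i] = y[x[i]] (apply x, then y).

    x is read sequentially, but y is read at effectively random
    positions, so large inputs are bound by memory latency.
    """
    return [y[xi] for xi in x]


def multiply_bucketed(x, y, block=4):
    """Same product, with the reads of y grouped by block.

    Illustrative assumption, not the paper's algorithm: `block` stands
    in for the number of entries of y that fit in cache at once.
    """
    n = len(x)
    num_blocks = (n + block - 1) // block

    # Phase 1: one sequential pass over x. Each pair (i, x[i]) is
    # appended to the bucket of the block of y that x[i] falls in.
    buckets = [[] for _ in range(num_blocks)]
    for i, xi in enumerate(x):
        buckets[xi // block].append((i, xi))

    # Phase 2: serve one bucket at a time. All reads of y within a
    # bucket stay inside a single block of `block` consecutive entries.
    z = [0] * n
    for bucket in buckets:
        for i, xi in bucket:
            z[i] = y[xi]
    return z


if __name__ == "__main__":
    x = [2, 0, 3, 1]
    y = [1, 3, 0, 2]
    print(multiply_naive(x, y))
    print(multiply_bucketed(x, y, block=2))
```

Both functions return the same permutation. The sketch only shows the access pattern: in Python it will not reproduce any speedup, and the writes to `z` in phase 2 are still scattered, which a real implementation would have to handle as well, for instance with a second bucketing pass.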