Memory bandwidth limitations of future microprocessors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Mapping irregular applications to DIVA, a PIM-based data-intensive architecture
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
The architecture of the DIVA processing-in-memory chip
ICS '02 Proceedings of the 16th international conference on Supercomputing
Computational RAM: Implementing Processors in Memory
IEEE Design & Test
Unsupervised Link Discovery in Multi-relational Data via Rarity Analysis
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Superword-Level Parallelism in the Presence of Control Flow
Proceedings of the international symposium on Code generation and optimization
The KOJAK group finder: connecting the dots via integrated knowledge-based and statistical reasoning
IAAI'04 Proceedings of the 16th conference on Innovative applications of artifical intelligence
Model-guided autotuning of high-productivity languages for petascale computing
Proceedings of the 18th ACM international symposium on High performance distributed computing
Fast UDFs to compute sufficient statistics on large data sets exploiting caching and sampling
Data & Knowledge Engineering
Hi-index | 0.00 |
The goal of this work is to gain insight into whether processing-in-memory (PIM) technology can be used to accelerate the performance of link discovery algorithms, which represent an important class of emerging knowledge discovery techniques. PIM chips that integrate processor logic into memory devices offer a new opportunity for bridging the growing gap between processor and memory speeds, especially for applications with high memory-bandwidth requirements. As LD algorithms are data-intensive and highly parallel, involving read-only queries over large data sets, parallel computing power extremely close (physically) to the data has the potential of providing dramatic computing speedups. For this reason, we evaluated the mapping of LD algorithms to a processing-in-memory (PIM) workstation-class architecture, the DIVA/Godiva hardware testbeds developed by USC/ISI. Accounting for differences in clock speed and data scaling, our analysis shows a performance gain on a single PIM, with the potential for greater improvement when multiple PIMs are used. Measured speedups of 8x are shown on two additional bandwidth benchmarks, even though the Itanium-2 has a clock rate 6X faster.