Custom Memory Placement for Parallel Data Mining

  • Authors:
  • Srinivasan Parthasarathy;Mohammed J Zaki;Wei Li


  • Venue:
  • Custom Memory Placement for Parallel Data Mining
  • Year:
  • 1997

Abstract

Many data mining tasks, such as Associations, Sequences, and Classification, use complex pointer-based data structures that typically suffer from sub-optimal locality. In the multi-processor case, shared access to these data structures may also result in false sharing. Most optimization techniques for enhancing locality and reducing false sharing have been proposed in the context of numeric applications involving array-based data structures. They are not applicable to dynamic data structures, because dynamic memory allocation from the heap returns arbitrary addresses.

Within the context of data mining, it is commonly observed that the building phase of these large recursive data structures, such as hash trees and decision trees, is random and independent of the access phase, which is usually ordered and typically dominates the computation time. In such cases, memory placement of these structures that is sensitive to locality and false sharing can enhance performance significantly. We evaluate a set of placement policies on a representative data mining application (association rule discovery) and show that simple placement schemes can improve execution time by more than a factor of two. More complex schemes yield an additional 5-20% gain.
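
To make the idea of access-ordered placement concrete, the following is a minimal sketch rather than the placement policies evaluated in the paper. It assumes a binary search tree stands in for the hash tree or decision tree, a Python list stands in for a contiguous memory region, and list indices stand in for addresses. The names `Node`, `build_tree`, `preorder`, `remap_depth_first`, and `stride_cost` are hypothetical. The tree is built in random insertion order, then its nodes are copied into a new region in depth-first order, so that an ordered traversal touches adjacent slots.

```python
import random


class Node:
    """Tree node whose child links are arena indices (-1 means no child)."""
    __slots__ = ("key", "left", "right")

    def __init__(self, key):
        self.key = key
        self.left = -1
        self.right = -1


def build_tree(keys):
    """Build phase: slot order equals insertion order, i.e. effectively random."""
    arena = []
    for key in keys:
        arena.append(Node(key))
        new = len(arena) - 1
        if new == 0:
            continue  # first node is the root at index 0
        cur = 0
        while True:
            node = arena[cur]
            if key < node.key:
                if node.left == -1:
                    node.left = new
                    break
                cur = node.left
            else:
                if node.right == -1:
                    node.right = new
                    break
                cur = node.right
    return arena


def preorder(arena, root=0):
    """Access phase: yield arena indices in depth-first (preorder) order."""
    stack = [root] if arena else []
    while stack:
        i = stack.pop()
        yield i
        node = arena[i]
        if node.right != -1:
            stack.append(node.right)
        if node.left != -1:
            stack.append(node.left)


def remap_depth_first(arena, root=0):
    """Copy nodes into a new arena laid out in depth-first access order."""
    order = list(preorder(arena, root))
    new_index = {old: new for new, old in enumerate(order)}
    packed = []
    for old in order:
        src = arena[old]
        dst = Node(src.key)
        # Rewrite child links to point at the relocated slots.
        dst.left = new_index.get(src.left, -1)
        dst.right = new_index.get(src.right, -1)
        packed.append(dst)
    return packed


def stride_cost(arena, root=0):
    """Sum of index distances between consecutively visited nodes
    (a crude proxy for locality: smaller means more sequential access)."""
    order = list(preorder(arena, root))
    return sum(abs(b - a) for a, b in zip(order, order[1:]))


if __name__ == "__main__":
    random.seed(0)
    tree = build_tree(random.sample(range(10000), 500))
    packed = remap_depth_first(tree)
    print("stride before:", stride_cost(tree))
    print("stride after: ", stride_cost(packed))
```

After remapping, a depth-first traversal visits slots 0, 1, 2, ... in order, so the stride cost falls to its minimum of one step per visited node. In a real implementation the same effect would come from a custom allocator that places nodes in a contiguous region, and avoiding false sharing would additionally require keeping data written by different processors on separate cache lines, which this sketch does not model.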