Enabling rapid development of parallel tree search applications

Authors:
Jeffrey P. Gardner; Andrew Connolly; Cameron McBride
Affiliations:
Pittsburgh Supercomputing Center, Pittsburgh, PA; University of Washington, Seattle, WA; University of Pittsburgh, Pittsburgh, PA
Venue:
Proceedings of the 5th IEEE workshop on Challenges of large applications in distributed environments
Year:
2007

Abstract

Virtual observatories will give astronomers easy access to an unprecedented amount of data. Extracting scientific knowledge from these data will increasingly demand both efficient algorithms and the power of parallel computers. Nearly all efficient analyses of large astronomical datasets use trees as their fundamental data structure. Writing efficient tree-based techniques, a task that is time-consuming even on single-processor computers, is exceedingly cumbersome on massively parallel platforms (MPPs). Most applications that run on MPPs are simulation codes, since the expense of developing them is offset by the fact that they will be used for many years by many researchers. In contrast, data analysis codes change far more rapidly, are often unique to individual researchers, and therefore see little reuse. Consequently, the economics of the current high-performance computing development paradigm for MPPs do not favor data analysis applications. We have therefore built a library, called Ntropy, that provides a flexible, extensible, and easy-to-use way of developing tree-based data analysis algorithms for both serial and parallel platforms. Our experience has shown that our library not only saves development time but can also deliver excellent serial performance and parallel scalability. Furthermore, Ntropy makes it easy for an astronomer with little or no parallel programming experience to quickly scale their application to a distributed multiprocessor environment. By minimizing development time for efficient and scalable data analysis, we enable wide-scale knowledge discovery on massive datasets.