Exact and approximate computation of a histogram of pairwise distances between astronomical objects

  • Authors:
  • Bin Fu;Eugene Fink;Garth Gibson;Jaime Carbonell

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA

  • Venue:
  • Proceedings of the 2012 workshop on High-Performance Computing for Astronomy Date
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

We compare several alternative approaches to computing correlation functions, which is a cosmological application for analyzing the distribution of matter in the universe. This computation involves counting the pairs of galaxies within a given distance from each other and building a histogram that shows the dependency of the number of pairs on the distance. The straightforward algorithm for counting the exact number of pairs has the O(n2) time complexity, which is unacceptably slow for most astronomical and cosmological datasets, which include billions of objects. We analyze the performance of several alternative algorithms, including the exact computation with an O(n(5/3)) average running time, an approximate computation with linear running time, and another approximate algorithm with sub-linear running time, based on sampling the given dataset and computing the correlation functions for the samples. We compare the accuracy of the described algorithms and analyze the tradeoff between their accuracy and running time. We also propose a novel hybrid approximation algorithm, which outperforms each other technique.