Automated Algorithmic Error Resilience for Structured Grid Problems Based on Outlier Detection

  • Authors:
  • Amoghavarsha Suresh;John Sartori

  • Affiliations:
  • University of Minnesota;University of Minnesota

  • Venue:
  • Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
  • Year:
  • 2014

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we propose automated algorithmic error resilience based on outlier detection. Our approach exploits the characteristic behavior of a class of applications to create metric functions that normally produce metric values according to a designed distribution or behavior and produce outlier values (i.e., values that do not conform to the designed distribution or behavior) when computations are affected by errors. For a robust algorithm that employs such an approach, error detection becomes equivalent to outlier detection. As such, we can make use of well-established, statistically rigorous techniques for outlier detection to effectively and efficiently detect errors, and subsequently correct them. Our error-resilient algorithms incur significantly lower overhead than traditional hardware and software error resilience techniques. Also, compared to previous approaches to application-based error resilience, our approaches parameterize the robustification process, making it easy to automatically transform large classes of applications into robust applications with the use of parser-based tools and minimal programmer effort. We demonstrate the use of automated error resilience based on outlier detection for structured grid problems, leveraging the flexibility of algorithmic error resilience to achieve improved application robustness and lower overhead compared to previous error resilience approaches. We demonstrate 2 × --3× improvement in output quality compared to the original algorithm with only 22% overhead, on average, for non-iterative structured grid problems. Average overhead is as low as 4.5% for error-resilient iterative structured grid algorithms that tolerate error rates up to 10E-3 and achieve the same output quality as their error-free counterparts.