Role-Oriented Code Generation in an Engine for Solving Hyperbolic PDE Systems

Published 15 Nov 2019 in cs.MS and cs.SE | (1911.06817v2)

Abstract: The development of a high-performance PDE solver requires the combined expertise of interdisciplinary teams in the application domain, numerical schemes, and low-level optimization. In this paper, we present how the ExaHyPE engine facilitates the collaboration of such teams by isolating three roles: application, algorithms, and optimization expert. This lets team members focus on their own area of expertise while integrating their contributions into an HPC production code. Inspired by web application development practices, ExaHyPE relies on two custom code generation modules, the Toolkit and the Kernel Generator, which follow a Model-View-Controller architectural pattern on top of the Jinja2 template engine library. Using Jinja2's templates to abstract the critical components of the engine and generated glue code, we isolate the application development from the engine. The template language also allows us to define and use custom template macros that isolate low-level optimizations from the numerical scheme described in the templates. We present three use cases, each focusing on one of our user roles, showcasing how the design of the code generation modules makes it easy to extend the solver schemes to support novel demands from applications, to add optimized algorithmic schemes (for example, with a reduced memory footprint), or to provide improved low-level SIMD vectorization support.
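
The Model-View-Controller split described in the abstract can be illustrated with a minimal sketch. This is not ExaHyPE's Toolkit or Kernel Generator: Python's standard-library `string.Template` stands in for Jinja2 here, and every name (`model`, `VIEW`, `render_solver`, the generated class and its constants) is hypothetical. The only numerical fact assumed is that a DG scheme of polynomial order N uses N + 1 nodes per axis.

```python
from string import Template

# Model: a hypothetical solver specification, as an application expert might supply it.
model = {"solver_name": "EulerSolver", "num_variables": 5, "order": 3}

# View: a template for generated C++ glue code.
# string.Template is a stand-in for Jinja2 in this sketch.
VIEW = Template(
    "class ${solver_name} {\n"
    "public:\n"
    "  static constexpr int NumberOfVariables = ${num_variables};\n"
    "  static constexpr int Order = ${order};\n"
    "  static constexpr int NodesPerAxis = ${nodes_per_axis};\n"
    "};\n"
)


def render_solver(spec):
    """Controller: derive the template context from the model and render the view."""
    context = dict(spec)
    # Derived quantity computed by the controller, not supplied by the user.
    context["nodes_per_axis"] = spec["order"] + 1
    return VIEW.substitute(context)


generated = render_solver(model)
print(generated)
```

The point of the split is that the specification, the derivation of context values, and the text of the generated code can each be changed by a different person without touching the other two.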

Citations (4)

Glossary

  • Adaptive mesh refinement: Technique to dynamically adjust grid resolution in different regions of the computational domain to capture features efficiently. "allowing for dynamic adaptive mesh refinement."
  • ADER-DG: A high-order Arbitrary DERivative Discontinuous Galerkin scheme combining ADER time integration with DG spatial discretization for hyperbolic PDEs. "the ADER-DG scheme"
  • Array of Structures (AoS): Memory layout where all fields of a data record are stored contiguously, which can hinder SIMD across multiple records. "This Array of Structure (AoS) data layout"
  • Auto-vectorization: Compiler optimization that automatically converts scalar operations into SIMD instructions when feasible. "compiler auto-vectorization may be enabled."
  • AVX2: Advanced Vector Extensions 2, a SIMD instruction set extension on x86 CPUs enabling 256-bit vector operations. "for more efficient AVX2 operations."
  • Cache misses: Performance events when requested data is not found in a CPU cache level, causing slower memory access. "significant loss of performances due to cache misses"
  • Cauchy–Kowalewski procedure: Analytic technique to replace time derivatives using spatial derivatives for linear PDEs in ADER methods. "the Cauchy-Kowalewski procedure."
  • CCZ4: A formulation of Einstein’s field equations used in numerical relativity, here as a complex PDE system benchmark. "the Einstein equations from relativistic astrophysics (CCZ4)"
  • CFL number: Courant–Friedrichs–Lewy stability parameter that restricts the timestep relative to mesh spacing and wave speeds. "The next time step size depends on the CFL number."
  • Discontinuous Galerkin (DG): Finite element method using element-local polynomial approximations with discontinuities at element interfaces. "ADER Discontinuous Galerkin (DG) method"
  • Domain Specific Language (DSL): Specialized input language for configuring PDE problems and solver options. "a Domain Specific Language (DSL) defined via JSON Schema"
  • Exascale: Computing systems capable of at least 10^18 operations per second, posing unique optimization challenges. "upcoming exascale architectures."
  • Finite Volume (FV) scheme: Discretization that conserves fluxes through control-volume faces; robust for shocks. "shock capturing FV scheme"
  • Godunov-type: Family of finite volume methods using Riemann solvers at cell interfaces for flux evaluation. "Godunov-type FV schemes."
  • Hyperbolic PDE: Class of partial differential equations governing wave-like phenomena with finite propagation speeds. "hyperbolic systems of partial differential equations (PDEs)."
  • Jinja2: Templating engine used to generate C++ code (kernels and glue) from parameterized templates. "Jinja2 template engine library."
  • JSON Schema: Formal specification for validating JSON documents; used to define the solver’s input DSL. "defined via JSON Schema"
  • Kernel (HPC): A performance-critical, self-contained computational routine implementing a numerical substep. "These kernels are the critical parts of ExaHyPE"
  • LIBXSMM: Library generating highly optimized small-matrix multiplication kernels tailored to CPU microarchitectures. "For these we employ LIBXSMM"
  • Model–View–Controller (MVC): Architectural pattern separating data (Model), generation logic (Controller), and code templates (View). "Model-View-Controller (MVC) architectural pattern"
  • MESA-PD: Particle dynamics code employing template-based code generation within the waLBerla framework. "the MESA-PD particle dynamics code"
  • Navier–Stokes equations: Governing equations for fluid dynamics modeling conservation of mass, momentum, and energy (compressible form here). "the compressible Navier-Stokes equations"
  • Non-conservative fluxes: Terms in PDE systems that cannot be written as the divergence of a flux vector and require special treatment. "non-conservative fluxes"
  • Peano framework: AMR mesh infrastructure based on tree-structured Cartesian grids used for ExaHyPE. "using the Peano framework"
  • Picard iterations: Fixed-point iteration method to solve nonlinear systems arising in element-local predictor steps. "calculated using Picard iterations"
  • Relaxed discrete maximum principle: Numerical admissibility criterion ensuring boundedness of polynomial solutions within a tolerance. "a relaxed discrete maximum principle in the sense of polynomials"
  • Riemann problem: Initial-value problem with a piecewise-constant discontinuity used to compute inter-cell numerical fluxes. "to evaluate the Riemann problem on element faces."
  • Riemann solver: Numerical method for resolving fluxes at discontinuities between cells based on local wave propagation. "We introduce a classical Riemann solver"
  • SIMD: Single Instruction, Multiple Data; parallel execution model applying one instruction to multiple data elements. "low-level SIMD vectorization support."
  • Shock capturing: Numerical techniques to stably resolve shock waves without spurious oscillations. "high resolution shock capturing FV scheme"
  • Space-time predictor: Element-local ADER step computing a space–time polynomial approximation before interface corrections. "the space-time predictor."
  • Structure of Arrays (SoA): Memory layout in which each field is stored contiguously across all records, which lets SIMD instructions operate on the same field of several records at once. "Structure of Array (SoA) layout"
  • Tensor operations: Multidimensional linear algebra operations (e.g., contractions) arising from DG discretizations. "element-local tensor operations."
  • UFL: Unified Form Language, a DSL for variational forms often paired with automated code generation. "such as UFL"
  • Viscous flux: Flux contributions dependent on gradients (e.g., stress and heat conduction) in Navier–Stokes systems. "viscous flux terms"
  • waLBerla framework: HPC software framework supporting modular, performance-portable simulation components. "waLBerla framework"
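
The difference between the Array of Structures and Structure of Arrays entries above comes down to index arithmetic on a flat buffer. The following is a minimal sketch under assumed names (`aos_to_soa`, `num_nodes`, `num_vars` are illustrative, not taken from the paper): in AoS, the variables of one node are adjacent; in SoA, the values of one variable across all nodes are adjacent.

```python
from array import array


def aos_to_soa(aos, num_nodes, num_vars):
    """Transpose a flat AoS buffer into a flat SoA buffer.

    AoS index: node * num_vars + var   (all variables of a node are adjacent)
    SoA index: var * num_nodes + node  (all nodes of a variable are adjacent)
    """
    soa = array("d", [0.0]) * (num_nodes * num_vars)
    for node in range(num_nodes):
        for var in range(num_vars):
            soa[var * num_nodes + node] = aos[node * num_vars + var]
    return soa


# Three nodes with two variables each, stored node by node (AoS).
aos = array("d", [1.0, 10.0, 2.0, 20.0, 3.0, 30.0])
soa = aos_to_soa(aos, num_nodes=3, num_vars=2)
print(list(soa))
```

In the SoA buffer, a loop over the nodes of one variable touches consecutive memory locations, which is the access pattern that SIMD loads and compiler auto-vectorization favor.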
