
Tools and Methodologies to Transition DTRA High Fidelity Codes to Leverage Heterogeneous Accelerated Processing Units

Description:

OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Advanced Computing and Software

 

OBJECTIVE: The objective of this project is to develop a performance analysis toolkit that developers can use to transition their codes to leverage Accelerated Processing Units (APUs). The toolkit must be part of a mature performance tools framework that provides other related performance analysis methodologies. A significant portion of the theoretical peak performance of several Exascale systems is attributed to APUs, so preparing DTRA's High-Fidelity (HF) computer codes to leverage APUs is of high importance. While technological advancements such as NVIDIA's Heterogeneous Memory Management (HMM) and Advanced Micro Devices' (AMD) CDNA2 Coherent Memory Architecture demonstrate the industry's efforts to simplify offloading codes to APUs by enabling shared or unified memory access between CPUs and GPUs, a substantial amount of work remains. Significant code refactoring and optimization efforts will still be required to efficiently map DTRA's HF codes to DoD High Performance Computing Modernization Program (HPCMP) systems equipped with APUs. Such efforts can be intelligently guided by workload performance characterization and analysis tools, which inspect the behavior of large-scale HF codes and suggest refactoring and optimization strategies, such as which computational loops could benefit from offloading to the APUs.
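For illustration only, the sketch below shows one way a candidate computational loop might be offloaded with OpenMP target directives while relying on the shared/unified memory access described above (HMM- or coherent-APU-style), so that no explicit device data mapping is required. The kernel is hypothetical and assumes a compiler and runtime that support the OpenMP unified_shared_memory requirement.

    // Minimal sketch: offloading a loop with OpenMP target directives while
    // relying on shared/unified CPU-GPU memory, so no map() clauses or
    // explicit device copies are needed. Hypothetical kernel for illustration.
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    #pragma omp requires unified_shared_memory

    int main() {
        const std::size_t n = 1 << 20;
        std::vector<double> a(n, 1.0), b(n, 2.0), c(n, 0.0);

        double *pa = a.data(), *pb = b.data(), *pc = c.data();

        // Candidate loop for APU offload: the same host pointers remain valid
        // on the device because memory access is shared/unified.
        #pragma omp target teams distribute parallel for
        for (std::size_t i = 0; i < n; ++i)
            pc[i] = pa[i] + pb[i];

        std::printf("c[0] = %f\n", pc[0]);
        return 0;
    }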

 

DESCRIPTION: DTRA uses HF codes on DoD HPCMP systems to investigate weapon effects phenomenology and techniques for countering weapons of mass destruction (WMD). End-to-end HF simulations in support of DTRA projects require calculations involving multiple phenomena that occur on vastly different time scales (µs to hours). With DTRA's growing reliance on HF codes for these tasks, the efficient use of current computational resources and strategic planning for upcoming architectures are essential. The existing best practices for offloading large codes to leverage APUs might not be directly applicable to HF codes. Many of these are legacy codes, developed over decades, and may require an initial evaluation for modernizing their programming model, such as transitioning from an MPI-only model to a hybrid MPI+OpenMP or MPI+OpenMP+APU-offload model, or adopting a more portable cross-platform programming framework. As such, collaborating with a DTRA HF code team to understand the requirements for developing the envisioned toolkit is highly encouraged. In addition, approaches that are portable across APU offerings from different vendors are desired. Offerors must meet all DoD HPCMP user requirements for access to these systems, which include, but are not limited to, possessing a security clearance or having a National Agency Check with Inquiries (NACI). DTRA will provide an allocation of HPC system resources and assistance in obtaining the successful offeror's user accounts on DoD HPCMP system(s).
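As a rough sketch of the hybrid MPI+OpenMP+APU-offload model mentioned above, the example below assumes one MPI rank per APU, OpenMP threads on the host, and an offloaded inner loop; the kernel is illustrative and not drawn from any HF code.

    // Hybrid MPI+OpenMP+APU-offload sketch (hypothetical kernel): one MPI rank
    // per APU, an offloaded local update, and a host-side threaded reduction.
    #include <mpi.h>
    #include <cstdio>
    #include <vector>

    int main(int argc, char **argv) {
        int provided = 0;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

        int rank = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int n = 1 << 20;
        std::vector<double> u(n, 1.0);
        double *pu = u.data();

        // Offload the local update to the APU attached to this rank.
        #pragma omp target teams distribute parallel for map(tofrom: pu[0:n])
        for (int i = 0; i < n; ++i)
            pu[i] *= 2.0;

        // Host-side threaded reduction across the local array, then across
        // ranks (halo exchanges and other coupling omitted for brevity).
        double local = 0.0, global = 0.0;
        #pragma omp parallel for reduction(+:local)
        for (int i = 0; i < n; ++i)
            local += pu[i];
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        if (rank == 0) std::printf("global sum = %f\n", global);
        MPI_Finalize();
        return 0;
    }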

 

PHASE I: Investigate existing open-source solutions and the applicable programming frameworks (such as RAJA and Kokkos) for HF codes. Document potential gaps in existing performance analysis tools that limit their ability to provide metrics of interest for HF codes on heterogeneous systems (including those that provide shared/unified memory access across CPUs and APUs). Understand the benefits and limitations of the shared/unified memory access feature and the accompanying programming model. Generate a feature list for the envisioned toolkit, tailored to HF codes, and describe how it fits into an overall mature performance tools framework. Collaborating with DTRA code developers, identify key concepts and methods that, when implemented, will provide non-intrusive, cross-platform tools that can operate effectively on complex HF codes. Proposers may use an open-source proxy code in Phase I to demonstrate feasibility.
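For context, the minimal sketch below illustrates the portable, single-source programming model offered by frameworks such as Kokkos (RAJA provides a comparable abstraction); the same code can target CPU or GPU/APU back ends selected at build time. The kernels are illustrative only.

    // Minimal Kokkos sketch: the same parallel_for/parallel_reduce source
    // runs on the back end (OpenMP, CUDA, HIP, ...) chosen when building.
    #include <Kokkos_Core.hpp>
    #include <cstdio>

    int main(int argc, char *argv[]) {
        Kokkos::initialize(argc, argv);
        {
            const int n = 1 << 20;
            Kokkos::View<double *> a("a", n), b("b", n), c("c", n);

            // Initialize the arrays on the default execution space.
            Kokkos::parallel_for("init", n, KOKKOS_LAMBDA(const int i) {
                a(i) = 1.0;
                b(i) = 2.0;
            });

            // Illustrative compute kernel.
            Kokkos::parallel_for("axpy", n, KOKKOS_LAMBDA(const int i) {
                c(i) = a(i) + 2.0 * b(i);
            });

            // Portable reduction back to the host.
            double sum = 0.0;
            Kokkos::parallel_reduce("sum", n,
                KOKKOS_LAMBDA(const int i, double &lsum) { lsum += c(i); }, sum);

            std::printf("sum = %f\n", sum);
        }
        Kokkos::finalize();
        return 0;
    }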

 

PHASE II: Develop a production-ready, cross-platform toolkit based on the Phase I approach and integrate it within the overall tool framework. Demonstrate the use of the tools on DoD HPCMP systems across a broad range of HF codes, including compressible flow (blast, high explosives, chemistry), incompressible flow (dispersion), multiphase flow (melting and evaporating particles), fluid-structure interaction, and large-deformation explicit structural dynamics (cracking, spallation, contact).
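For illustration, the hypothetical sketch below shows the kind of lightweight region timing such a toolkit might collect when characterizing kernels and ranking offload candidates; the RegionTimer class and its methods are invented names for this example, not an existing API.

    // Hypothetical sketch of per-region timing used to characterize kernels;
    // RegionTimer, start(), stop(), and report() are illustrative names only.
    #include <chrono>
    #include <cstdio>
    #include <map>
    #include <string>

    class RegionTimer {
    public:
        void start(const std::string &name) { starts_[name] = Clock::now(); }
        void stop(const std::string &name) {
            totals_[name] +=
                std::chrono::duration<double>(Clock::now() - starts_[name]).count();
        }
        void report() const {
            // Accumulated wall time per labeled region.
            for (const auto &kv : totals_)
                std::printf("%-20s %10.6f s\n", kv.first.c_str(), kv.second);
        }
    private:
        using Clock = std::chrono::steady_clock;
        std::map<std::string, Clock::time_point> starts_;
        std::map<std::string, double> totals_;
    };

    int main() {
        RegionTimer timer;
        const int n = 1 << 22;
        double sum = 0.0;

        timer.start("reduction_loop");
        for (int i = 0; i < n; ++i)
            sum += 0.5 * i;          // stand-in for an HF kernel
        timer.stop("reduction_loop");

        timer.report();
        std::printf("sum = %f\n", sum);
        return 0;
    }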

 

PHASE III DUAL USE APPLICATIONS: The performance tools developed for use on very demanding application codes will be well suited, once refined, for use on more general HPC workloads on Exascale architectures. Improvements in this phase are expected to involve ease-of-use enhancements and hardening of the profiling tools for use on a wide range of application software used in Government research and industry.

 

REFERENCES:

  1. https://developer.nvidia.com/blog/simplifying-gpu-application-development-with-heterogeneous-memory-management/
  2. https://computing.llnl.gov/projects/raja-managing-application-portability-next-generation-platforms
  3. https://kokkos.org/
  4. https://nowlab.cse.ohio-state.edu/static/media/workshops/presentations/espm2_23/PublicSC23ESPM2ProgrammingAMDInstinctMI300A.pdf
  5. https://centers.hpc.mil/users/index.html#accounts

 

KEYWORDS: High Performance Computing; HPC; Accelerated Processing Units; APU; Message Passing Interface; MPI; Open Multi-Processing; OpenMP; High-Fidelity; Graphics Processing Unit; GPU
