BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:The Rise of Portable GPU Programming: Experiences Developing GPU-B
 ased Scientific Simulation Applications for Intel\, NVIDIA\, and AMD GPUs 
 - Dr John Tramm\, Argonne National Lab
DTSTART:20221116T130000Z
DTEND:20221116T140000Z
UID:TALK192029@talks.cam.ac.uk
CONTACT:Jo Boyle
DESCRIPTION:Historically\, portability has not been important for GPU prog
 ramming as NVIDIA has dominated the high performance computing (HPC) GPU m
 arket. In this context\, it has always made sense to develop scientific HP
 C apps using NVIDIA’s proprietary CUDA programming model. However\, in 2
 022 both AMD and Intel are releasing HPC GPU products with the intention o
 f competing directly with NVIDIA. In fact\, the world’s first exascale s
 upercomputer (Oak Ridge National Laboratory’s Frontier) is powered by AM
 D GPUs\, with another even larger exascale supercomputer (Aurora) powered 
 by Intel GPUs set to arrive at Argonne National Laboratory shortly. These 
 new computers highlight a trend not just from CPU to GPU in HPC\, but also
  a trend from proprietary CUDA to a number of different portable perform
 ance models for GPU. Thus\, scientific application developers are now conf
 ronted with not only the difficulty of porting or developing apps for GPU
  architectures\, but also with selecting from a wide variety of portable G
 PU programming models (for instance\, OpenMP offloading\, HIP\, SYCL/DPC++
 \, OpenCL\, Kokkos\, and RAJA).\n\nIn this talk\, I will briefly introdu
 ce the newest supercomputing systems and will give an overview of the many
  different portable performance models now available for GPUs. I will show
  a few snippets of an example kernel implemented in a variety of different
  models\, and will even compare performance of a scientific mini-app\, XSB
 ench\, across all major programming models and GPU architectures. Subjecti
 ve “pros and cons” of each programming model will be discussed along w
 ith quantitative performance comparisons. Next\, I will use a full scienti
 fic GPU application (the OpenMC Monte Carlo particle transport code) as a 
 case study to discuss real-world issues affecting portable scientific GPU 
 applications and how bleeding-edge GPU compiler technology stacks are fari
 ng. I will also briefly discuss a few of the algorithmic performance optim
 izations that were developed for OpenMC to give a feel for what types of c
 hanges are required to achieve high performance on modern GPUs. \n\n\nFor 
 further information about this talk please email Dr Paul Cosgrove: pmc55@c
 am.ac.uk\n\n
LOCATION:Department of Engineering - Lecture Theatre 6
END:VEVENT
END:VCALENDAR
