In the realm of high-performance computing (HPC), the jump from theory to practical application is best understood through rigorous case studies. Following up on last year's exploration of general multi-GPU workflows, this session narrows the focus to a specific, non-linear challenge common in computational fluid dynamics (CFD): the 2D Burgers' equation.
While previous examples often rely on simpler linear models like the Poisson equation, the Burgers' equation adds non-linear convection on top of diffusion, which better captures real flow behavior but also complicates how the problem scales on GPU hardware. This talk will dissect the implementation of a multi-GPU solver for this equation, with a primary focus on the trade-offs between precision and performance, using open-source tools written in Python, Fortran, and/or C/C++, running natively on Linux!
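
For reference, the 2D viscous Burgers' equations in their standard form couple the velocity components $u$ and $v$ through the non-linear convection terms on the left and the diffusion terms, scaled by the viscosity $\nu$, on the right:

$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} = \nu\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right), \qquad \frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} = \nu\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}\right)$$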
I will present a detailed analysis of how the choice of grid size directly impacts the time to solution. I will explore:
- Domain Decomposition: How to effectively split the 2D grid across multiple GPUs to maximize parallel efficiency (a minimal sketch of this pattern follows the list).
- Scaling Analysis: Real-world benchmarks showing where adding more GPUs yields diminishing returns versus where it unlocks massive speedups for finer grid resolutions.
- Bottleneck Identification: How changing the grid size shifts the bottleneck from compute to communication (interconnect bandwidth) and how to mitigate this.
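
To make the decomposition pattern concrete, here is a minimal sketch of a 1D slab split with halo exchange between neighboring ranks. This is illustrative only, not the talk's implementation: the grid size, the mpi4py/CuPy stack, the one-rank-per-GPU mapping, and the CUDA-aware MPI assumption are all stand-ins for whichever of the languages and tools above the solver actually uses.

```python
# Minimal sketch: 1D slab decomposition of a 2D grid across GPUs,
# with halo exchange between neighboring MPI ranks.
from mpi4py import MPI
import cupy as cp

comm = MPI.COMM_WORLD
rank, nprocs = comm.Get_rank(), comm.Get_size()
# Bind this rank to one GPU (assumes one rank per GPU on each node).
cp.cuda.Device(rank % cp.cuda.runtime.getDeviceCount()).use()

NX, NY = 4096, 4096        # hypothetical global grid size
ny_local = NY // nprocs    # rows owned by this rank (assumes an exact split)

# Local block of the field, padded with one halo row above and below.
u = cp.zeros((ny_local + 2, NX), dtype=cp.float64)

# Neighbor ranks in the slab decomposition; MPI.PROC_NULL turns the
# exchange into a no-op at the physical domain boundaries.
up   = rank - 1 if rank > 0 else MPI.PROC_NULL
down = rank + 1 if rank < nprocs - 1 else MPI.PROC_NULL

def exchange_halos(u):
    """Swap boundary rows with neighbors so the stencil can read u[0] and u[-1]."""
    # Passing CuPy arrays directly requires a CUDA-aware MPI build;
    # otherwise, stage the rows through host-side NumPy buffers.
    comm.Sendrecv(sendbuf=u[1, :],  dest=up,   recvbuf=u[-1, :], source=down)
    comm.Sendrecv(sendbuf=u[-2, :], dest=down, recvbuf=u[0, :],  source=up)

exchange_halos(u)
# ... apply the convection/diffusion stencil to the interior u[1:-1, :] ...
```

In this pattern, the halo exchange is exactly where interconnect bandwidth enters the picture: as more GPUs shrink each rank's slab, the stencil compute falls faster than the boundary traffic does, which is the compute-to-communication shift the benchmarks examine.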
Whether you are optimizing engineering simulations or building scientific software, this session will provide concrete data and actionable patterns for scaling physics solvers on modern multi-GPU hardware.



