fix|build[cartesian]: Fix NVCC default flags for `-O0` and refactor configuration out of `config.py` by FlorianDeconinck · Pull Request #2524 · GridTools/gt4py

FlorianDeconinck · 2026-03-11T16:58:49Z

Description

In #2520 we introduced a bad behavior when forcing FMA deactivation in -O0: the flag we passed to nvcc isn't one it can read.

While fixing the above, we also moved the entire configuration of GPU device code & execution into compiler.py, to centralize it and clean up the central config.py.

Everything remains functionally the same: environment variables take precedence over defaults, and the entire GPU stack is set up from CUDA_ROOT & the cupy device.

E.g.:

- Split environment variables for GPU source code compile flags off CXX flags
- Move GPU compiler configuration out in compiler.py
- Force FMA (-fmad=false) off for -O0
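As a rough illustration of the items above, here is a minimal sketch of how device compile flags could be assembled. It is not the code in compiler.py: the function name `cuda_compile_flags`, the default optimization level of `3`, and the ordering of the flags are assumptions made for the example. Only the environment variable names and `-fmad=false` come from this PR.

```python
import os
import shlex
from collections.abc import Mapping


def cuda_compile_flags(env: Mapping[str, str] = os.environ) -> list[str]:
    """Hypothetical helper (not gt4py's API): build nvcc flags from the environment."""
    # The environment variable wins over the default; "3" is an assumed default.
    opt_level = env.get("GT4PY_COMPILE_OPT_LEVEL", "3")
    flags = [f"-O{opt_level}"]

    # For -O0 builds, turn fused multiply-add contraction off.
    if opt_level == "0":
        flags.append("-fmad=false")

    # GPU-only extra flags live in their own variable, separate from the CXX flags.
    extra = env.get("GT4PY_CARTESIAN_EXTRA_CUDA_COMPILE_ARGS", "")
    flags.extend(shlex.split(extra))
    return flags
```

Passing a plain dict instead of `os.environ` makes the precedence rules easy to check without touching the real environment.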

⚠️ I didn't add a unit test because a real system test would require a "fresh" load of gt4py with a different env (-O0 in GT4PY_COMPILE_OPT_LEVEL), which is beyond the scope of pytest.
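For reference, one possible way to get such a "fresh" load is to start a new interpreter with a modified environment. This is only a sketch and not part of this PR: `run_in_fresh_interpreter` is a hypothetical helper, and the child snippet below is a stand-in that just echoes the variable rather than importing gt4py.

```python
import os
import subprocess
import sys


def run_in_fresh_interpreter(code: str, extra_env: dict[str, str]) -> str:
    """Hypothetical helper: run `code` in a new Python process with extra env vars."""
    env = {**os.environ, **extra_env}
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,  # raise if the child process fails
    )
    return result.stdout.strip()


# Stand-in for "import gt4py and print the resolved nvcc flags".
CHILD_CODE = "import os; print(os.environ['GT4PY_COMPILE_OPT_LEVEL'])"
```

A real test would replace `CHILD_CODE` with a snippet that imports gt4py and prints the flags it resolved, then assert on the captured output.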

Move GPU compiler configuration out in `compiler.py` Force FMA (`fmad`) off for `-O0`

romanc · 2026-03-11T20:23:59Z

cscs-ci run

romanc · 2026-03-12T08:19:43Z

cscs-ci run

Use separate `GT4PY_CARTESIAN_EXTRA_CUDA_COMPILE_ARGS` environment variable and not `GT4PY_EXTRA_COMPILE_ARGS` (same as CPU) because GPU compilers don't always follow the flag names of the CPU compilers.

Adds a bit of (semi-related) cleanup:

- names of enums are usually singular (e.g. *Name not *Names)
- rely on type inference where possible

Split GPU source code compile flags off CXX flags

99d7d4b

FlorianDeconinck requested review from romanc and twicki March 11, 2026 16:58

romanc approved these changes Mar 12, 2026

romanc merged commit 2d2511a into GridTools:main Mar 12, 2026
19 checks passed

romanc deleted the fix/gpu_compilation branch March 12, 2026 10:57

romanc merged 2 commits into GridTools:main from FlorianDeconinck:fix/gpu_compilation
