Fix AMD APU VIS_VRAM detection: swapped constants, None crash, and code duplication #3

Draft
Copilot wants to merge 2 commits into main from copilot/fix-issues-in-gpustack-runtime-9
Copilot AI commented Mar 15, 2026

This PR fixes four bugs identified during PR review (by Gemini and Copilot) of the AMD APU unified-memory VRAM detection feature.

Fixes

  • Swapped memory type constants (pyrocmsmi/__init__.py): RSMI_MEM_TYPE_VIS_VRAM was 2 and RSMI_MEM_TYPE_GTT was 1, which is inverted relative to the ROCm SMI spec (rsmi_memory_type_t enum: VIS_VRAM=1, GTT=2). As a result, the APU fix was silently querying GTT instead of VIS_VRAM.

  • NameError in memory function defaults (pyrocmsmi/__init__.py): rsmi_dev_memory_total_get / rsmi_dev_memory_usage_get defaulted to rsmi_memory_type_t.RSMI_MEM_TYPE_VRAM, which is only in scope when rsmiBindings imports successfully (guarded by a conditional block). Replaced with the always-available numeric constant RSMI_MEM_TYPE_VRAM = 0.

  • TypeError on None VRAM values (amd.py): amdsmi_get_gpu_vram_usage().get("vram_total") can return None, causing max(None, dev_mem_vis_vram) to raise TypeError. Normalized the lookup by appending "or 0", so a missing value becomes 0 before the comparison.

  • Duplicated VIS_VRAM correction logic (amd.py): The VIS_VRAM override block was copy-pasted into both the pyamdsmi try branch and the pyrocmsmi fallback except branch. Moved it to after the try/except so it applies once regardless of which path resolved dev_mem.
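The two pyrocmsmi/__init__.py fixes can be illustrated with a minimal, self-contained sketch. It does not reproduce the actual module: the function names `try_define_with_enum_default` and `memory_total_get_fixed` are hypothetical, and the missing `rsmi_memory_type_t` name simulates a failed `rsmiBindings` import. The point is that Python evaluates default arguments when the `def` statement runs, so a default that refers to a conditionally imported name can fail at definition time, while a plain numeric constant cannot.

```python
# Numeric values follow the rsmi_memory_type_t ordering quoted above:
# VRAM=0, VIS_VRAM=1, GTT=2.
RSMI_MEM_TYPE_VRAM = 0
RSMI_MEM_TYPE_VIS_VRAM = 1
RSMI_MEM_TYPE_GTT = 2


def try_define_with_enum_default():
    """Return the NameError message raised when the enum is not in scope.

    `rsmi_memory_type_t` is intentionally undefined in this sketch to mimic
    a failed conditional import of the bindings module.
    """
    try:
        # The default expression is evaluated as soon as `def` executes.
        def memory_total_get(dev_idx, mem_type=rsmi_memory_type_t.RSMI_MEM_TYPE_VRAM):  # noqa: F821
            return mem_type
    except NameError as exc:
        return str(exc)
    return None


def memory_total_get_fixed(dev_idx, mem_type=RSMI_MEM_TYPE_VRAM):
    """Hypothetical stand-in: the default is an always-available constant."""
    return mem_type


if __name__ == "__main__":
    print(try_define_with_enum_default())
    print(memory_total_get_fixed(0))
    print(memory_total_get_fixed(0, RSMI_MEM_TYPE_VIS_VRAM))
```

Using module-level integers keeps the function signatures importable even when the optional bindings are absent; callers that do have the enum can still pass its members explicitly.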

# Before: duplicated in both try and except
try:
    dev_mem = dev_gpu_vram_usage.get("vram_total")  # could be None
    with contextlib.suppress(pyrocmsmi.ROCMSMIError):  # duplicated block
        dev_mem_vis_vram = byte_to_mebibyte(pyrocmsmi.rsmi_dev_memory_total_get(dev_idx, RSMI_MEM_TYPE_VIS_VRAM))
        dev_mem = max(dev_mem, dev_mem_vis_vram)  # TypeError if dev_mem is None
except pyamdsmi.AmdSmiException:
    dev_mem = byte_to_mebibyte(pyrocmsmi.rsmi_dev_memory_total_get(dev_idx))
    with contextlib.suppress(pyrocmsmi.ROCMSMIError):  # same block again
        ...

# After: normalized inputs, single correction after try/except
try:
    dev_mem = dev_gpu_vram_usage.get("vram_total") or 0
    ...
except pyamdsmi.AmdSmiException:
    dev_mem = byte_to_mebibyte(pyrocmsmi.rsmi_dev_memory_total_get(dev_idx))
    ...
# Applied once to whichever path set dev_mem
with contextlib.suppress(pyrocmsmi.ROCMSMIError):
    dev_mem_vis_vram = byte_to_mebibyte(pyrocmsmi.rsmi_dev_memory_total_get(dev_idx, RSMI_MEM_TYPE_VIS_VRAM))
    dev_mem = max(dev_mem, dev_mem_vis_vram)
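
The normalization and single-correction steps can also be sketched in isolation, without the pyamdsmi or pyrocmsmi libraries. The helper names `resolve_total_mib` and `unnormalized_max_fails` are hypothetical, and the values are assumed to be already converted to MiB; a `vis_vram_total` of None stands in for the case where the VIS_VRAM query is suppressed.

```python
def resolve_total_mib(reported_total, vis_vram_total=None):
    """Combine the reported VRAM total with an optional VIS_VRAM reading.

    reported_total: value from the primary path, possibly None.
    vis_vram_total: VIS_VRAM reading, or None if that query failed.
    """
    # Normalize first so the comparison below never sees None.
    dev_mem = reported_total or 0
    # Apply the correction once, regardless of which path set dev_mem.
    if vis_vram_total is not None:
        dev_mem = max(dev_mem, vis_vram_total)
    return dev_mem


def unnormalized_max_fails():
    """Show that comparing None with an int raises TypeError."""
    try:
        max(None, 512)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(resolve_total_mib(None, 512))
    print(resolve_total_mib(16384, 512))
    print(unnormalized_max_fails())
```

Note that `or 0` also maps an explicit 0 to 0, which is harmless here because the result only feeds a `max` comparison.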

Co-authored-by: Readon <3614708+Readon@users.noreply.github.com>
Copilot AI changed the title from "[WIP] [gpustack/runtime#9] Fix issues found by gemini-code-assist and copilot" to "Fix AMD APU VIS_VRAM detection: swapped constants, None crash, and code duplication" on Mar 15, 2026
Copilot AI requested a review from Readon March 15, 2026 17:25