Fabric Manager shared-nvswitch virtualization model support#166
mresvanis wants to merge 7 commits into NVIDIA:master
Conversation
@mresvanis I am interested in reviewing this PR; once it is ready, please ping me.
Force-pushed from a487b1f to 0761aa3.
Force-pushed from 0761aa3 to 41f840e.
@alaypatel07 this PR is up for review.
@mresvanis Thanks for updating the PR.
@fanzhangio thank you for bringing those points up! My intention was to start those discussions as soon as you had reviewed at least the direction we started with :)
That is indeed by design, as I thought I might be missing some use cases for this device plugin. But I agree: when the node is set up with FM shared-NVSwitch virtualization mode, letting kubelet randomly allocate GPUs is a no-go. I'll change this behavior from a warning to an error and add a clear log message explaining what's happening.
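A minimal sketch of the warning-to-error change discussed here, assuming a hypothetical guard function (`checkFabricManagerReady` and both parameter names are illustrative, not the plugin's actual API):

```go
// Sketch: refuse to serve allocations instead of only warning when the node
// is in FM shared-NVSwitch mode but no partition manager is available.
package main

import (
	"errors"
	"fmt"
)

// checkFabricManagerReady is a hypothetical guard; in the real plugin this
// decision would live wherever the warning is emitted today.
func checkFabricManagerReady(fmEnabled, partitionManagerUp bool) error {
	if fmEnabled && !partitionManagerUp {
		// Previously a warning; a hard error prevents kubelet from
		// randomly allocating GPUs that belong to inactive partitions.
		return errors.New("FM shared-NVSwitch mode is enabled but the partition manager is unavailable; refusing to serve allocations")
	}
	return nil
}

func main() {
	if err := checkFabricManagerReady(true, false); err != nil {
		fmt.Println("allocation rejected:", err)
	}
}
```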
That is spot on, this is exactly our intention: start by ensuring that all partitions needed at any time can be activated properly, then move to a more robust solution: a reconciler watching Node PodResources that deactivates partitions based on observed workloads. I chose not to include the Node PodResources reconciler in this PR to keep it small and easily reviewable. That said, I'm happy to add it here if you think the complete solution belongs in a single PR. WDYT?
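The follow-up reconciler described above could be sketched as a pure function over observed state; all names here (`reconcilePartitions`, the map shapes) are illustrative assumptions, not the planned implementation:

```go
// Sketch: given device IDs observed in use via the Node PodResources API and
// the currently active partitions, compute which partitions to deactivate.
package main

import "fmt"

// reconcilePartitions returns the partitions whose devices back no observed
// workload; a real reconciler would call this periodically and then issue
// deactivation requests to Fabric Manager.
func reconcilePartitions(inUse map[string]bool, active map[string][]string) []string {
	var toDeactivate []string
	for partition, devices := range active {
		busy := false
		for _, d := range devices {
			if inUse[d] {
				busy = true
				break
			}
		}
		if !busy {
			toDeactivate = append(toDeactivate, partition)
		}
	}
	return toDeactivate
}

func main() {
	inUse := map[string]bool{"GPU-0": true}
	active := map[string][]string{
		"p1": {"GPU-0", "GPU-1"}, // still backing a workload
		"p2": {"GPU-2", "GPU-3"}, // no observed workload
	}
	fmt.Println(reconcilePartitions(inUse, active)) // expect [p2]
}
```

Keeping the decision logic side-effect free like this makes the eventual reconciler easy to unit-test independently of the PodResources client.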
41f840e to
1d728db
Compare
…hen FM enabled: Extract NUMA-based device selection into a standalone preferDevicesByNUMA method. When a partition manager is active, GetPreferredAllocation now delegates to it for FM-aware selection with NUMA locality; otherwise it falls back to the original NUMA-only logic. Adds comprehensive tests for the FM-aware path covering partition matching, NUMA tie-breaking, error cases, unavailable GPUs, and must-include device ordering.
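The delegation described in this commit message could be sketched as follows; the interface, function names, and the stand-in NUMA logic are assumptions for illustration, not the plugin's real types:

```go
// Sketch: GetPreferredAllocation delegates to a partition manager when one
// is present, otherwise falls back to NUMA-only selection.
package main

import (
	"fmt"
	"sort"
)

// partitionManager is a hypothetical interface for FM-aware selection.
type partitionManager interface {
	PreferDevices(available, required []string, size int) ([]string, error)
}

// preferDevicesByNUMA stands in for the extracted NUMA-only method: keep
// must-include devices first, then fill the request deterministically.
func preferDevicesByNUMA(available, required []string, size int) []string {
	out := append([]string{}, required...)
	req := map[string]bool{}
	for _, r := range required {
		req[r] = true
	}
	var rest []string
	for _, d := range available {
		if !req[d] {
			rest = append(rest, d)
		}
	}
	sort.Strings(rest)
	for _, d := range rest {
		if len(out) >= size {
			break
		}
		out = append(out, d)
	}
	return out
}

func getPreferredAllocation(pm partitionManager, available, required []string, size int) ([]string, error) {
	if pm != nil {
		return pm.PreferDevices(available, required, size) // FM-aware path
	}
	return preferDevicesByNUMA(available, required, size), nil // fallback
}

func main() {
	devs, _ := getPreferredAllocation(nil, []string{"GPU-2", "GPU-0", "GPU-1"}, []string{"GPU-2"}, 2)
	fmt.Println(devs) // must-include device first: [GPU-2 GPU-0]
}
```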
When the fabric manager is enabled, the Allocate handler now activates partitions for the requested device IDs before returning the allocation response, failing the request if the connection is lost or activation errors out.
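The fail-fast Allocate flow described above can be sketched roughly like this; `fabricClient`, `ActivatePartition`, and `allocate` are hypothetical names, not the plugin's actual API:

```go
// Sketch: activate the partition for the requested devices before returning
// the allocation response, and fail the request on any activation error.
package main

import (
	"errors"
	"fmt"
)

type fabricClient struct {
	connected bool
}

func (c *fabricClient) ActivatePartition(deviceIDs []string) error {
	if !c.connected {
		return errors.New("fabric manager connection lost")
	}
	// Real code would look up the partition covering these devices and
	// issue the activation request to Fabric Manager here.
	return nil
}

func allocate(c *fabricClient, deviceIDs []string) error {
	if err := c.ActivatePartition(deviceIDs); err != nil {
		// Surfacing the error fails the pod's allocation instead of
		// handing out GPUs whose NVLink partition is inactive.
		return fmt.Errorf("failed to activate partition for %v: %w", deviceIDs, err)
	}
	// ... build and return the normal allocation response ...
	return nil
}

func main() {
	fmt.Println(allocate(&fabricClient{connected: true}, []string{"GPU-0", "GPU-1"}))
	fmt.Println(allocate(&fabricClient{connected: false}, []string{"GPU-0"}))
}
```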
Force-pushed from 1d728db to a83992a.
Summary
This PR adds the following changes:
Related NVIDIA GPU Operator changes: NVIDIA/gpu-driver-container#538 and NVIDIA/gpu-operator#2045
Changes
This change adds optional Fabric Manager support behind the ENABLE_FABRIC_MANAGER environment variable (disabled by default). When enabled, the device plugin:
New packages:
Test plan