fix(sandbox/bootstrap): GPU Landlock baseline paths and CDI spec missing diagnosis #710
Open
…k baseline

Landlock READ_FILE/WRITE_FILE restricts open(2) on character device files even when DAC permissions would otherwise allow it. GPU sandboxes need /dev/nvidiactl, /dev/nvidia-uvm, /dev/nvidia-uvm-tools, /dev/nvidia-modeset, and per-GPU /dev/nvidiaX nodes in the policy to allow NVML initialization.

Additionally, CDI bind-mounts /run/nvidia-persistenced/socket into the container. NVML tries to connect to this socket at init time; if the directory is not in the Landlock policy, it receives EACCES (not ECONNREFUSED), which causes NVML to abort with NVML_ERROR_INSUFFICIENT_PERMISSIONS even though nvidia-persistenced is optional.

Both classes of paths are auto-added to the baseline when /dev/nvidiactl is present. Per-GPU device nodes are enumerated at runtime to handle multi-GPU configurations.
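The enumeration described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `GPU_FIXED_DEVICES`, `per_gpu_index`, and `gpu_device_paths` are hypothetical names, and the function takes a list of `/dev` entry names as input (rather than reading the directory itself) so the logic is easy to exercise in isolation.

```rust
use std::path::PathBuf;

/// Fixed NVIDIA device nodes named in the commit message.
/// (Hypothetical constant name, for illustration only.)
const GPU_FIXED_DEVICES: &[&str] = &[
    "/dev/nvidiactl",
    "/dev/nvidia-uvm",
    "/dev/nvidia-uvm-tools",
    "/dev/nvidia-modeset",
];

/// Returns the GPU index for per-GPU node names such as "nvidia0" or
/// "nvidia12", and None for everything else (including "nvidiactl").
fn per_gpu_index(name: &str) -> Option<u32> {
    let rest = name.strip_prefix("nvidia")?;
    if rest.is_empty() || !rest.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    rest.parse().ok()
}

/// Builds the GPU device path list from the entry names found in /dev.
/// Returns an empty list when "nvidiactl" is absent, mirroring the rule
/// that GPU paths are only added when /dev/nvidiactl is present.
fn gpu_device_paths<'a, I>(dev_entries: I) -> Vec<PathBuf>
where
    I: IntoIterator<Item = &'a str>,
{
    let names: Vec<&str> = dev_entries.into_iter().collect();
    if !names.contains(&"nvidiactl") {
        return Vec::new();
    }

    // Fixed control devices first.
    let mut paths: Vec<PathBuf> = GPU_FIXED_DEVICES
        .iter()
        .map(|p| PathBuf::from(*p))
        .collect();

    // Then per-GPU nodes, ordered by numeric index so nvidia2 precedes nvidia10.
    let mut per_gpu: Vec<(u32, &str)> = names
        .iter()
        .filter_map(|n| per_gpu_index(n).map(|i| (i, *n)))
        .collect();
    per_gpu.sort_by_key(|(i, _)| *i);
    paths.extend(
        per_gpu
            .into_iter()
            .map(|(_, name)| PathBuf::from(format!("/dev/{name}"))),
    );
    paths
}
```

In a real bootstrap path the entry names would presumably come from something like `std::fs::read_dir("/dev")`; the sketch adds the fixed nodes unconditionally and assumes that paths missing on the host are tolerated individually, as the per-path fix (#677) implies.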
…eline

NVML only needs to traverse the directory and connect to the Unix socket; it does not create or modify files there. Read-only (traversal) access is sufficient; read-write was unnecessarily broad.
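To make the EACCES versus ECONNREFUSED distinction concrete, here is a small diagnostic sketch. It does not reproduce NVML's internal logic, which this PR does not show; `PersistencedProbe`, `classify_connect_error`, and `probe_persistenced` are hypothetical names. It only relies on the standard mapping from errno values to `std::io::ErrorKind`.

```rust
use std::io::ErrorKind;
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Outcome of trying to reach the nvidia-persistenced socket.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
enum PersistencedProbe {
    /// The daemon is running and accepted the connection.
    Connected,
    /// ENOENT or ECONNREFUSED: the daemon is simply not there, which is
    /// harmless because nvidia-persistenced is optional.
    DaemonAbsent,
    /// EACCES: something (for example a Landlock rule) blocked the attempt.
    PolicyDenied,
    /// Any other failure.
    Other(ErrorKind),
}

/// Maps the error kind of a failed connect(2) to a probe outcome.
fn classify_connect_error(kind: ErrorKind) -> PersistencedProbe {
    match kind {
        ErrorKind::NotFound | ErrorKind::ConnectionRefused => PersistencedProbe::DaemonAbsent,
        ErrorKind::PermissionDenied => PersistencedProbe::PolicyDenied,
        other => PersistencedProbe::Other(other),
    }
}

/// Attempts to connect to the given Unix socket and classifies the result.
fn probe_persistenced(socket: &Path) -> PersistencedProbe {
    match UnixStream::connect(socket) {
        Ok(_) => PersistencedProbe::Connected,
        Err(e) => classify_connect_error(e.kind()),
    }
}
```

Running `probe_persistenced(Path::new("/run/nvidia-persistenced/socket"))` inside a sandbox would be one way to check whether the directory is covered by the policy: a `PolicyDenied` result points at a missing Landlock rule rather than a missing daemon.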
When Docker has CDI configured but no CDI spec files exist on the host, container startup fails with an opaque error. Add a failure pattern to detect this and surface actionable recovery steps pointing to `nvidia-ctk cdi generate`.
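A failure pattern of this kind might look roughly like the sketch below. The `Diagnosis` struct, the constant, and the function name are hypothetical, and the matched substring ("unresolvable CDI devices") is an assumption about the error text Docker surfaces in this situation; the PR's actual pattern is not shown here. The `--output` path in the recovery step is likewise just a commonly used location.

```rust
/// A matched failure with user-facing recovery steps.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
struct Diagnosis {
    summary: &'static str,
    recovery_steps: &'static [&'static str],
}

/// Lower-cased substrings assumed to indicate that CDI is enabled but no
/// spec files exist on the host. The exact wording is an assumption.
const CDI_SPEC_MISSING_PATTERNS: &[&str] = &["unresolvable cdi devices"];

/// Returns a diagnosis when the container startup error looks like a
/// missing-CDI-spec failure, and None otherwise.
fn diagnose_cdi_specs_missing(startup_error: &str) -> Option<Diagnosis> {
    let haystack = startup_error.to_lowercase();
    let matched = CDI_SPEC_MISSING_PATTERNS
        .iter()
        .any(|pattern| haystack.contains(*pattern));
    if !matched {
        return None;
    }
    Some(Diagnosis {
        summary: "Docker is configured for CDI, but no CDI spec files were found on the host",
        recovery_steps: &[
            "Generate a spec on the host: sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml",
            "Confirm the devices are visible: nvidia-ctk cdi list",
            "Retry creating the sandbox",
        ],
    })
}
```

Matching on a lower-cased substring keeps the check tolerant of prefix text that Docker or the daemon may add around the underlying error.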
…U and proxy mode

After the Landlock per-path fix (#677), missing paths no longer silently disable the entire ruleset. This exposed two gaps:

Proxy baseline:
- /proc and /dev/urandom are needed by virtually every process (already in restrictive_default_policy but missing from the enrichment baseline).

GPU baseline:
- /proc must be read-write because CUDA writes to /proc/<pid>/task/<tid>/comm during cuInit() to set thread names. Using /proc (not /proc/self/task) because Landlock rules bind to inodes and child processes have different procfs inodes than the parent shell.

Also skips chown for virtual filesystem paths (/proc, /sys) in prepare_filesystem since they are kernel-managed and may contain symlinks (e.g. /proc/self) that trigger the symlink safety check.
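How the two baselines combine can be sketched as below. The constant names follow the PR description, but their contents are abbreviated to the paths mentioned in this PR (the fixed NVIDIA device nodes other than `/dev/nvidiactl` are left out), and `BaselinePaths` and `enrich_baseline` are hypothetical. Dropping a path from the read-only set once it is granted read-write is a presentational choice in this sketch, not something the PR states.

```rust
use std::collections::BTreeSet;

// Abbreviated: only the entries this PR talks about.
const PROXY_BASELINE_READ_ONLY: &[&str] = &["/proc", "/dev/urandom"];
const PROXY_BASELINE_READ_WRITE: &[&str] = &["/sandbox", "/tmp"];
const GPU_BASELINE_READ_ONLY_FIXED: &[&str] = &["/run/nvidia-persistenced"];
const GPU_BASELINE_READ_WRITE_FIXED: &[&str] = &["/dev/nvidiactl", "/proc"];

/// Paths the sandbox policy is enriched with. (Hypothetical type.)
#[derive(Debug)]
struct BaselinePaths {
    read_only: BTreeSet<&'static str>,
    read_write: BTreeSet<&'static str>,
}

/// Assembles the baseline for proxy mode, adding the GPU paths when
/// GPU passthrough is active.
fn enrich_baseline(gpu_present: bool) -> BaselinePaths {
    let mut read_only: BTreeSet<&'static str> =
        PROXY_BASELINE_READ_ONLY.iter().copied().collect();
    let mut read_write: BTreeSet<&'static str> =
        PROXY_BASELINE_READ_WRITE.iter().copied().collect();

    if gpu_present {
        read_only.extend(GPU_BASELINE_READ_ONLY_FIXED.iter().copied());
        // /proc moves to read-write here: CUDA writes thread names under
        // /proc/<pid>/task/<tid>/comm during cuInit().
        read_write.extend(GPU_BASELINE_READ_WRITE_FIXED.iter().copied());
    }

    // A path granted read-write does not also need a read-only entry.
    read_only.retain(|p| !read_write.contains(p));

    BaselinePaths { read_only, read_write }
}
```

With `gpu_present` false, `/proc` stays read-only; with it true, `/proc` appears only in the read-write set alongside the device nodes.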
…are_filesystem

chown on /proc and /sys succeeds silently (the kernel ignores it for virtual filesystems), so the special-case skip added in the previous commit is not needed.
pimlock commented Apr 1, 2026
Comment on lines +893 to +894

    "/proc",
    "/dev/urandom",
    /// user working directory and temporary files.
    const PROXY_BASELINE_READ_WRITE: &[&str] = &["/sandbox", "/tmp"];

    /// Fixed read-only paths required when a GPU is present.
This whole section is similar to the baseline stuff, but it is activated when GPU passthrough is on.
This could benefit from the mechanism we will have for providers bringing in their policies, though this case is different: it applies sandbox-wide and is not limited by binaries.
Summary
Fixes GPU sandbox failures and general Landlock gaps exposed after the per-path enforcement fix (#677) landed.
Related Issue
Related to #398, follow-on to #503 and #677
Background
When the CDI injection PR (#503) was tested, it appeared to work because of an unrelated Landlock bug (#677, fixed Mar 30): a missing path (`/app`) in the baseline caused `PathFd::new()` to abort the entire Landlock ruleset under BestEffort mode, silently disabling all filesystem restrictions. Once #677 fixed that, Landlock was properly enforced, exposing multiple missing paths.
Changes
Proxy baseline (`PROXY_BASELINE_READ_ONLY`)
- Added `/proc` and `/dev/urandom`: virtually every process needs these (already in `restrictive_default_policy()` but missing from the enrichment baseline).
GPU baseline (`GPU_BASELINE_READ_WRITE_FIXED`)
- `/dev/nvidiactl`, `/dev/nvidia-uvm`, `/dev/nvidia-uvm-tools`, `/dev/nvidia-modeset` + dynamically enumerated `/dev/nvidiaX`: NVML/CUDA open these with `O_RDWR`.
- `/proc` (read-write): CUDA writes to `/proc/<pid>/task/<tid>/comm` during `cuInit()` to set thread names. Without write access, `cuInit()` returns error 304 (`cudaErrorOperatingSystem`). Must use `/proc` rather than `/proc/self/task` because Landlock rules bind to inodes: `/proc/self/task` in the pre_exec hook resolves to the shell's PID, but child processes have different procfs inodes.
GPU baseline (`GPU_BASELINE_READ_ONLY_FIXED`)
- `/run/nvidia-persistenced`: NVML tries to connect to the persistenced socket; if blocked by Landlock it gets `EACCES` (not `ECONNREFUSED`) and aborts. Only read/traversal access needed.
CDI specs missing diagnosis
Testing
- `mise run pre-commit` passes
- `nvidia-smi` verified working in GPU sandbox
- `sandbox connect` interactive shell (`/dev/urandom`) verified working
Checklist