[PERF]: Faster void * conversion #1616

Open
mdboom wants to merge 4 commits into NVIDIA:main from mdboom:faster-conversion

Conversation


@mdboom mdboom commented Feb 12, 2026

We currently accept an int, a CUdeviceptr, or a buffer-providing object as convertible to a void *. This is currently handled by a class, _HelperInputVoidPtr, which mainly exists to manage the lifetime of the input when it exposes a buffer.

This object (like all PyObjects) is allocated on the heap and gets freed implicitly by Cython at the end of the function. Since it only exists to manage lifetimes when the object exposes a buffer, we pay this heap-allocation penalty even in the common case where the input is a simple integer.

This change allocates the Py_buffer on the stack instead, which is faster for reasons similar to #1545. It means we are trading some stack space (88 bytes) for speed, but given that CUDA Python API calls can't recursively call themselves, I'm not concerned.

This improves the overhead time in the benchmark in #659 from 2.97 µs/call to 2.67 µs/call.

The old _HelperInputVoidPtr class stays around because it is still useful when the input is a list of void *-convertible objects and we can't statically determine how much space to allocate.
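
For readers skimming the diff, here is a minimal Cython sketch of the idea. The struct and helper names follow the ones discussed in this PR, but the bodies, the dispatch order, and the except clause are illustrative assumptions rather than the PR's verbatim code:

from cpython.buffer cimport PyObject_CheckBuffer, PyObject_GetBuffer, PyBUF_SIMPLE

cdef struct _HelperInputVoidPtrStruct:
    Py_buffer _pybuffer

cdef void * _helper_input_void_ptr(ptr, _HelperInputVoidPtrStruct *helper) except? NULL:
    # Mark the buffer as unused so the matching free helper knows whether
    # PyBuffer_Release is required later.
    helper[0]._pybuffer.buf = NULL
    if ptr is None:
        return NULL
    if isinstance(ptr, int):
        # Common fast path: a raw address as a Python int; no heap allocation.
        return <void*><size_t>ptr
    if PyObject_CheckBuffer(ptr):
        # The object exposes a buffer: acquire it into the caller's
        # stack-allocated struct so the memory stays pinned for the call.
        PyObject_GetBuffer(ptr, &helper[0]._pybuffer, PyBUF_SIMPLE)
        return helper[0]._pybuffer.buf
    # Anything else (e.g. a CUdeviceptr wrapper) is expected to convert through int().
    return <void*><size_t>int(ptr)

A caller would declare a _HelperInputVoidPtrStruct as a local variable (roughly the 88 bytes of stack mentioned above), pass its address in, and release any acquired buffer after the driver call, so the common integer case never touches the heap.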


copy-pr-bot bot commented Feb 12, 2026

Auto-sync is disabled for ready-for-review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.


mdboom commented Feb 12, 2026

/ok to test


Comment on lines -137 to -138
elif isinstance(ptr, (_driver["CUdeviceptr"])):
    self._cptr = <void*><void_ptr>int(ptr)

Q: This path seems to be gone?

cdef void * _helper_input_void_ptr(ptr, _HelperInputVoidPtrStruct *buffer)

cdef inline void * _helper_input_void_ptr_free(_HelperInputVoidPtrStruct *helper):
    if helper[0]._pybuffer.buf != NULL:

Q: Should we check first if helper is NULL?
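
One possible shape for that guard, as a sketch only; the PyBuffer_Release call in the body is an assumption, since the quoted snippet is truncated before the release:

from cpython.buffer cimport PyBuffer_Release

cdef inline void * _helper_input_void_ptr_free(_HelperInputVoidPtrStruct *helper):
    # Bail out early if the caller passed a NULL helper pointer.
    if helper == NULL:
        return NULL
    if helper[0]._pybuffer.buf != NULL:
        # Release only if the conversion path actually acquired a buffer.
        PyBuffer_Release(&helper[0]._pybuffer)
    return NULL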

    self._cptr = NULL
elif isinstance(ptr, (int)):
    # Easy run, user gave us an already configured void** address
    try:

It seems we can avoid code duplication by replacing the try-except block with a call to the new helper like this?

self._cptr = _helper_input_void_ptr(ptr, <_HelperInputVoidPtrStruct*><PyObject*>self)

(I'm not so sure about the self casting, I think it's correct because they share the same layout.)
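
An alternative that would sidestep the layout question is to embed the struct in the cdef class and pass a pointer to that member. This is purely an illustrative sketch, since it assumes class internals not shown in this PR and reuses the struct and helper sketched earlier in the thread:

cdef class _HelperInputVoidPtr:
    cdef void* _cptr
    # Hypothetical member: holds the Py_buffer-carrying struct directly,
    # so we pass &self._helper instead of casting `self` to a
    # layout-compatible pointer type.
    cdef _HelperInputVoidPtrStruct _helper

    def __cinit__(self, ptr):
        self._cptr = _helper_input_void_ptr(ptr, &self._helper)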

@leofang leofang added this to the cuda.bindings 13.1.2 & 12.9.6 milestone Feb 14, 2026
@leofang leofang added the enhancement (Any code-related improvements), cuda.bindings (Everything related to the cuda.bindings module), and P1 (Medium priority - Should do) labels Feb 14, 2026
