
Introduce benchmark framework using CUDA events#157

Open
mcgibbon wants to merge 3 commits into NVIDIA:main from mcgibbon:feature/benchmark-framework

Conversation

@mcgibbon
Contributor

@mcgibbon mcgibbon commented Mar 12, 2026

This PR adds timing for the SHT and for the torch implementation of the DISCO convolution through a new benchmarking framework, run via python -m torch_harmonics.benchmark.

This is largely taken from the implementation we used (and I authored) in https://github.com/ai2cm/ace.
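For reference, the core idea is to time GPU work with CUDA events rather than host-side clocks, falling back to wall-clock timing on CPU. The sketch below is illustrative only: the function and variable names (time_fn, _HAS_CUDA) are hypothetical and not taken from the PR, which wraps this pattern in CUDATimer / CPUEventPair classes.

```python
# Illustrative sketch of CUDA-event timing with a CPU wall-clock fallback.
# Names here are hypothetical; the PR's actual timer classes may differ.
import time

try:
    import torch
    _HAS_CUDA = torch.cuda.is_available()
except ImportError:  # torch not installed: fall back to wall-clock timing
    torch = None
    _HAS_CUDA = False

def time_fn(fn, iters: int = 10) -> float:
    """Return the mean runtime of fn in milliseconds."""
    if _HAS_CUDA:
        # CUDA events measure device time without a host sync per iteration
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        for _ in range(iters):
            fn()
        end.record()
        torch.cuda.synchronize()  # wait so elapsed_time() is valid
        return start.elapsed_time(end) / iters
    t0 = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - t0) * 1000.0 / iters
```

The key design point is that `elapsed_time` is only meaningful after the stream has been synchronized past the end event, which is why the synchronize happens once, outside the loop.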

mcgibbon and others added 3 commits March 12, 2026 15:30
Introduce a torch_harmonics.benchmark subpackage with:
- Timer infrastructure (CUDATimer, NullTimer, CPUEventPair) for GPU
  event-based and CPU wall-clock timing
- BenchmarkABC base class with registry via @register_benchmark
- CLI runner (python -m torch_harmonics.benchmark) that saves JSON results
- RealSHT and InverseRealSHT benchmarks at 1-degree resolution

Also add benchmark_results to .gitignore.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Register a disco_conv_s2_torch_1deg benchmark at 1-degree resolution
(B=4, 4 channels, 180x360) using the non-optimized torch contraction
path, which does not require the custom CUDA extension.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce hardware.py with a device-name-to-scale-factor lookup table
so benchmark batch sizes adapt to different GPUs. Base batch sizes are
tuned for Tesla T4 (factor 1.0). Unknown devices default to 1.0 with
a warning to add an entry for their hardware.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
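The hardware.py lookup described above might be sketched as follows. Only the Tesla T4 = 1.0 baseline and the warn-and-default behavior come from the commit message; the function name and table contents here are assumptions.

```python
# Hypothetical sketch of the device-name -> batch-size scale factor lookup.
# Only the Tesla T4 baseline is stated in the commit message.
import warnings

_SCALE_FACTORS = {
    "Tesla T4": 1.0,  # baseline: base batch sizes are tuned for this GPU
}

def get_scale_factor(device_name: str) -> float:
    """Return the batch-size multiplier for a GPU, defaulting to 1.0."""
    try:
        return _SCALE_FACTORS[device_name]
    except KeyError:
        warnings.warn(
            f"No batch-size scale factor registered for {device_name!r}; "
            "defaulting to 1.0. Consider adding an entry for your hardware."
        )
        return 1.0
```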
@mcgibbon
Contributor Author

@azrael417 you may find this helpful to check the SHT timings on your hardware, for #155. You'll want to insert new batch size scaling factors to fully occupy the hardware. I tried to make it straightforward to add new benchmarks.

The entrypoint will create git-tag-labelled JSON files under benchmark_results/ in the directory you run it from (the location can be changed with a flag).

@azrael417
Collaborator

Hello Jeremy, thanks for putting this together.
I have added multiple things to the MR. This is what I added:

  • backward benchmark
  • device selection support
  • batch size override

Can you please have a look and see if you are OK with it?

