A4 & A4X TRTLLM GKE single-host inference benchmarking recipes by hmhv1222 · Pull Request #146 · AI-Hypercomputer/gpu-recipes

hmhv1222 · 2026-03-13T20:39:25Z

Adds A4 and A4X TRTLLM GKE single-host inference benchmarking recipes, along with README files and config YAML files for benchmarking with specific configurations and parallelism hyperparameters.

Tested and validated on A4 and A4X GKE nodes for TRTLLM inference benchmarking with specific settings for tensor parallelism (TP), pipeline parallelism (PP), expert parallelism (EP), number of GPU chips, input and output sequence lengths, and precision.

Each model YAML file shows only one specific combination of parallelism hyperparameters and configs. Input and output lengths need to be adjusted according to the model and its configs.
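Since the parallelism settings and sequence lengths have to be adjusted together, a quick consistency check before launching a benchmark may help. The sketch below is illustrative only: `BenchmarkConfig`, its field names, and `max_seq_len` are hypothetical and are not the keys used in the recipe YAML files, and the numeric values are placeholders. It also assumes that TP × PP equals the number of GPUs and that EP divides TP, which should be confirmed against the TRTLLM version in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkConfig:
    """Hypothetical container; field names are NOT the recipe's YAML keys."""
    num_gpus: int
    tp: int           # tensor parallelism
    pp: int           # pipeline parallelism
    ep: int           # expert parallelism (1 for dense models)
    input_len: int    # input sequence length in tokens
    output_len: int   # output sequence length in tokens
    max_seq_len: int  # assumed context limit of the model/engine


def validate(cfg: BenchmarkConfig) -> list[str]:
    """Return a list of human-readable problems; empty means consistent."""
    problems: list[str] = []
    # Assumption: one rank per GPU, so TP * PP must match the GPU count.
    if cfg.tp * cfg.pp != cfg.num_gpus:
        problems.append(
            f"TP*PP = {cfg.tp * cfg.pp}, but num_gpus = {cfg.num_gpus}"
        )
    # Assumption: expert parallelism is carved out of the TP group.
    if cfg.ep < 1 or cfg.tp % cfg.ep != 0:
        problems.append(f"EP = {cfg.ep} does not divide TP = {cfg.tp}")
    # Input plus output tokens must fit within the assumed context limit.
    if cfg.input_len + cfg.output_len > cfg.max_seq_len:
        problems.append(
            f"input_len + output_len = {cfg.input_len + cfg.output_len} "
            f"exceeds max_seq_len = {cfg.max_seq_len}"
        )
    return problems


if __name__ == "__main__":
    # Placeholder values, not a tested configuration from this PR.
    cfg = BenchmarkConfig(num_gpus=8, tp=8, pp=1, ep=8,
                          input_len=1024, output_len=1024, max_seq_len=4096)
    print(validate(cfg) or "config looks consistent")
```

Running a check like this against each candidate combination catches mismatched GPU counts or oversized sequence lengths before a serving pod is scheduled.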

hmhv1222 added 2 commits March 13, 2026 20:36

A4 & A4X TRTLLM GKE Single-Host Inference Recipes, ReadMe and Config Files

67630c6

Add benchmarking configs warning paragraphs in TRTLLM inference ReadMe

875e36b

hmhv1222 force-pushed the mhvictorhau-20260303-a4-a4x-trtllm-benchmarking branch from 3ba13b6 to 875e36b March 13, 2026 20:42

Remove memory resources constraint in A4X serving-launcher.yaml and change default number of GPUs in A4 TRTLLM inference recipe

5db07bb

hmhv1222 requested a review from Chris113113 March 13, 2026 20:49

hmhv1222 wants to merge 3 commits into main from mhvictorhau-20260303-a4-a4x-trtllm-benchmarking
