A4 & A4X TRTLLM GKE single-host inference benchmarking recipes #146

Open
hmhv1222 wants to merge 3 commits into main from mhvictorhau-20260303-a4-a4x-trtllm-benchmarking
@hmhv1222
Collaborator

This PR adds A4 and A4X TRTLLM GKE single-host inference benchmarking recipes: README files and config YAML files for benchmarking with specific configurations and parallelism hyperparameters.

The recipes were tested and validated on A4 and A4X GKE nodes for TRTLLM inference benchmarking with specific combinations of TP, PP, EP, number of GPUs, input and output sequence length, and precision.

Each model YAML file shows only one specific combination of parallelism hyperparameters and configs. Input and output lengths need to be adjusted according to the model and its configs.
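
Since the parallelism sizes and sequence lengths have to be adjusted together, a quick consistency check before launching a run may help. The sketch below is illustrative only: `BenchmarkConfig`, its field names, and `max_seq_len` are hypothetical and are not the YAML keys used by these recipes. It assumes the GPU count equals TP × PP and that EP divides TP, which is how TensorRT-LLM is commonly configured; verify both against the recipe you are running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkConfig:
    """Hypothetical container; field names are NOT the recipes' YAML keys."""
    tp: int           # tensor parallel size
    pp: int           # pipeline parallel size
    ep: int           # expert parallel size (1 for dense models)
    num_gpus: int     # GPUs requested on the single host
    input_len: int    # input sequence length in tokens
    output_len: int   # output sequence length in tokens
    max_seq_len: int  # assumed limit on input + output tokens for the model


def validate(cfg: BenchmarkConfig) -> list[str]:
    """Return a list of problems; an empty list means the combination is consistent."""
    problems = []
    # Assumption: one GPU per (TP rank, PP stage) pair.
    if cfg.tp * cfg.pp != cfg.num_gpus:
        problems.append(
            f"TP*PP = {cfg.tp * cfg.pp} does not match num_gpus = {cfg.num_gpus}"
        )
    # Assumption: expert parallelism is carved out of the TP group.
    if cfg.ep < 1 or cfg.tp % cfg.ep != 0:
        problems.append(f"EP = {cfg.ep} does not divide TP = {cfg.tp}")
    # The input and output lengths together must fit the model's sequence limit.
    if cfg.input_len + cfg.output_len > cfg.max_seq_len:
        problems.append(
            f"input_len + output_len = {cfg.input_len + cfg.output_len} "
            f"exceeds max_seq_len = {cfg.max_seq_len}"
        )
    return problems


if __name__ == "__main__":
    cfg = BenchmarkConfig(tp=8, pp=1, ep=8, num_gpus=8,
                          input_len=1024, output_len=1024, max_seq_len=4096)
    for problem in validate(cfg):
        print("config problem:", problem)
```

Returning a list of messages rather than raising on the first failure lets all mismatches in a YAML file be reported in one pass.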

@hmhv1222 force-pushed the mhvictorhau-20260303-a4-a4x-trtllm-benchmarking branch from 3ba13b6 to 875e36b on March 13, 2026 20:42
…hange default number of GPUs in A4 TRTLLM inference recipe
@hmhv1222 requested a review from Chris113113 on March 13, 2026 20:49