feat(deployment): centerpoint deployment integration #181
vividf wants to merge 39 commits into tier4:feat/new_deployment_and_evaluation_pipeline
Conversation
```diff
 verification = dict(
     enabled=False,
-    tolerance=1e-1,
+    tolerance=1,
```
Can you explain what `tolerance` means here, and why it was updated from 0.1 to 1?
The value was originally set for calibration classification and later copied to CenterPoint, but it does not work correctly for CenterPoint.
INFO:deployment.core.evaluation.verification_mixin: tensorrt (cuda:0) latency: 205.08 ms
INFO:deployment.core.evaluation.verification_mixin: output[heatmap]: shape=(1, 5, 510, 510), max_diff=0.070197, mean_diff=0.007674
INFO:deployment.core.evaluation.verification_mixin: output[reg]: shape=(1, 2, 510, 510), max_diff=0.007944, mean_diff=0.001120
INFO:deployment.core.evaluation.verification_mixin: output[height]: shape=(1, 1, 510, 510), max_diff=0.025401, mean_diff=0.002122
INFO:deployment.core.evaluation.verification_mixin: output[dim]: shape=(1, 3, 510, 510), max_diff=0.031920, mean_diff=0.001143
INFO:deployment.core.evaluation.verification_mixin: output[rot]: shape=(1, 2, 510, 510), max_diff=0.075215, mean_diff=0.004582
INFO:deployment.core.evaluation.verification_mixin: output[vel]: shape=(1, 2, 510, 510), max_diff=0.221999, mean_diff=0.004940
INFO:deployment.core.evaluation.verification_mixin: Overall Max difference: 0.221999
INFO:deployment.core.evaluation.verification_mixin: Overall Mean difference: 0.004347
WARNING:deployment.core.evaluation.verification_mixin: tensorrt (cuda:0) verification FAILED ✗ (max diff: 0.221999 > tolerance: 0.100000)
INFO:deployment.core.evaluation.verification_mixin:
Do you know the reason why it fails? Since this is a verification, it's always better to check the reason rather than update the tolerance.
It doesn't necessarily indicate a failure.
When converting from PyTorch to TensorRT, some numerical differences are expected due to different kernels, precision handling, and TensorRT optimizations.
The verification is mainly used as a safeguard to detect major issues (e.g., incorrect conversion settings) rather than to enforce exact numerical equivalence.
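A safeguard like this typically reduces to comparing per-output max/mean absolute differences against a tolerance. A minimal sketch of that idea (function name and structure are assumptions, not the actual `verification_mixin` internals):

```python
import numpy as np

def verify_outputs(reference, candidate, tolerance):
    # Hypothetical sketch: compare each named output tensor from a
    # reference backend (e.g. ONNX) against a candidate backend
    # (e.g. TensorRT) and fail if the worst-case deviation exceeds
    # the configured tolerance.
    per_output_means = []
    overall_max = 0.0
    for name, ref in reference.items():
        abs_diff = np.abs(ref - candidate[name])
        overall_max = max(overall_max, float(abs_diff.max()))
        per_output_means.append(float(abs_diff.mean()))
    overall_mean = float(np.mean(per_output_means))
    return overall_max <= tolerance, overall_max, overall_mean

# A constant offset of 0.2 fails a 0.1 tolerance but would pass 0.5.
ref = {"heatmap": np.zeros((1, 5, 4, 4), dtype=np.float32)}
trt = {"heatmap": np.full((1, 5, 4, 4), 0.2, dtype=np.float32)}
passed, max_diff, mean_diff = verify_outputs(ref, trt, tolerance=0.1)
```

The key design point is that the gate is on the *max* difference, which is why a few outlier elements (e.g. in `vel`) can fail a run whose mean difference is tiny.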
Since 1e-1 is what we set for ResNet18 in calibration classification, the cases are different.
Anyway, 5e-1 could be a better value.
Running onnx (cuda:0) reference...
2026-03-10 15:20:07.511273431 [V:onnxruntime:, execution_steps.cc:103 Execute] stream 0 activate notification with index 0
2026-03-10 15:20:07.567219724 [V:onnxruntime:, execution_steps.cc:47 Execute] stream 0 wait on Notification with id: 0
INFO:deployment.core.evaluation.verification_mixin: onnx (cuda:0) latency: 1423.80 ms
INFO:deployment.core.evaluation.verification_mixin:
Running tensorrt (cuda:0) test...
INFO:deployment.core.evaluation.verification_mixin: tensorrt (cuda:0) latency: 1141.26 ms
INFO:deployment.core.evaluation.verification_mixin: output[heatmap]: shape=(1, 5, 510, 510), max_diff=0.464849, mean_diff=0.056135
INFO:deployment.core.evaluation.verification_mixin: output[reg]: shape=(1, 2, 510, 510), max_diff=0.056639, mean_diff=0.006198
INFO:deployment.core.evaluation.verification_mixin: output[height]: shape=(1, 1, 510, 510), max_diff=0.227012, mean_diff=0.065522
INFO:deployment.core.evaluation.verification_mixin: output[dim]: shape=(1, 3, 510, 510), max_diff=0.336713, mean_diff=0.028087
INFO:deployment.core.evaluation.verification_mixin: output[rot]: shape=(1, 2, 510, 510), max_diff=0.515039, mean_diff=0.023962
INFO:deployment.core.evaluation.verification_mixin: output[vel]: shape=(1, 2, 510, 510), max_diff=0.932002, mean_diff=0.034206
INFO:deployment.core.evaluation.verification_mixin: Overall Max difference: 0.932002
INFO:deployment.core.evaluation.verification_mixin: Overall Mean difference: 0.037279
WARNING:deployment.core.evaluation.verification_mixin: tensorrt (cuda:0) verification FAILED ✗ (max diff: 0.932002 > tolerance: 0.500000)
On a different computer, the values can differ. I will leave it at 1 for now.
Did you set a random seed for this validation? Randomness (for example, shuffling point clouds) significantly affects the results; otherwise, I believe the difference between computers is too large.
Note that the reported difference corresponds to the maximum deviation; the mean difference is actually quite small.
Additionally, the magnitude of the difference depends heavily on the hardware. For example, on Blackwell GPUs (ONNX CUDA vs. TensorRT), the discrepancy is minimal. In contrast, on my laptop, the difference between ONNX CUDA and TensorRT is around 1. Even when forcing ONNX Runtime to use CUDA only, it still initializes a default CPU executor and executes some operations on the CPU, which can introduce discrepancies.
Interestingly, when comparing ONNX CPU with TensorRT on my laptop, the difference becomes very small. However, on Blackwell, the ONNX CPU vs. TensorRT comparison shows a larger gap.
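To rule out data-side randomness such as point-cloud shuffling, the comparison inputs can be pinned with a seed helper. A dependency-light sketch (the helper name is an assumption; in the real pipeline one would also seed torch):

```python
import random
import numpy as np

def seed_everything(seed: int) -> None:
    # Hypothetical helper: pin data-side randomness so that any
    # remaining ONNX-vs-TensorRT difference comes from the backends
    # themselves, not from shuffling. In a torch pipeline, also call
    # torch.manual_seed(seed) and torch.cuda.manual_seed_all(seed).
    random.seed(seed)
    np.random.seed(seed)

seed_everything(42)
first = np.random.permutation(100)   # e.g. a point-shuffle order
seed_everything(42)
second = np.random.permutation(100)
# first and second are identical, so both backends see the same input
```

Note that seeding only removes input variance; kernel- and precision-level differences between backends remain, which is what the tolerance is meant to bound.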
Let's put a TODO here to investigate the issue in another PR.
Some of the modules, for example,
```python
model_cfg = Config.fromfile(args.model_cfg)
config = BaseDeploymentConfig(deploy_cfg)

_validate_required_components(config.components_cfg)
```
move _validate_required_components to BaseDeploymentConfig
This only validates the names needed for CenterPoint.
Why does it only validate the names needed for CenterPoint? It should be a function that validates the needed names for different models, right?
@KSeangTan
Regarding this, I would like to change those names so they can be reused for BEVFusion in another PR.
Signed-off-by: vividf <yihsiang.fang@tier4.jp>
KSeangTan left a comment:

Thanks for the work, and please address the comments accordingly.
```diff
 verification = dict(
     enabled=False,
-    tolerance=1e-1,
+    tolerance=1,
```
Let's put a TODO here to investigate the issue in another PR.
```python
    model_cfg: Config,
    metrics_config: Detection3DMetricsConfig,
    components_cfg: ComponentsConfig,
):

    self._components_cfg = components_cfg

    task_profile = TaskProfile(
```
Try to avoid magic strings.
```python
input_features, voxel_dict = model._extract_features(data_loader, sample_idx)

if not isinstance(input_features, torch.Tensor):
```
Is it possible that `input_features` is not a `torch.Tensor`? Otherwise, please consider using an assert.
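The two styles under discussion can be contrasted in a small dependency-free sketch (a plain `list` stands in for the real `torch.Tensor` type; the function name is illustrative):

```python
def check_type(input_features):
    # Style used in the PR: explicit runtime check with a clear error,
    # appropriate when the wrong type can plausibly reach this point.
    if not isinstance(input_features, list):  # stand-in for torch.Tensor
        raise TypeError(
            f"expected list, got {type(input_features).__name__}"
        )
    # Reviewer's alternative for internal invariants that should never
    # fail in correct code (note: asserts are stripped under `python -O`):
    assert isinstance(input_features, list)
    return input_features

features = check_type([1.0, 2.0, 3.0])
```

The trade-off: `raise TypeError` survives optimized mode and gives a user-facing message, while `assert` documents an invariant the code itself guarantees.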
```python
    Raises:
        ValueError: If class_names not found in pytorch_model.cfg.
    """
    cfg = getattr(pytorch_model, "cfg", None)
```
Why do we use getattr? We can simply call pytorch_model.cfg right?
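The difference between the two access patterns, sketched with a hypothetical stand-in class (names are placeholders, not the PR's actual model):

```python
class ModelStub:
    # Hypothetical stand-in for pytorch_model; only `cfg` matters here.
    cfg = {"class_names": ["class_a", "class_b"]}

model = ModelStub()

# Pattern in the PR: getattr with a default lets the caller raise its
# own, more descriptive error when the attribute is absent.
cfg = getattr(model, "cfg", None)
if cfg is None:
    raise ValueError("class_names not found in pytorch_model.cfg")

# Reviewer's suggestion: direct access; Python raises AttributeError
# by itself if `cfg` does not exist, with no custom message.
cfg_direct = model.cfg
```

If the attribute is guaranteed to exist on every model passed in, direct access is simpler; `getattr` only earns its keep when the missing-attribute case needs a tailored error.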
```python
    device=device,
)

self.num_classes: int = len(class_names)
```
You can remove `int` if the type hint is clear.
```python
# Select execution providers based on device
providers = self.device.to_ort_provider()
if self.device.is_cuda:
```
```python
device_message = "CUDA" if self.device.is_cuda else "CPU"
logger.info(f"Using {device_message} execution provider for ONNX")
```
This is cleaner.
```python
model_cfg = Config.fromfile(args.model_cfg)
config = BaseDeploymentConfig(deploy_cfg)

_validate_required_components(config.components_cfg)
```
Why does it only validate the names needed for CenterPoint? It should be a function that validates the needed names for different models, right?
Summary
Integrates CenterPoint into the unified deployment framework, enabling deployment and evaluation of ONNX and TensorRT models.
Note: this PR includes the changes from #180.
Changes
- Moved `projects/CenterPoint` to `deployment/projects/centerpoint`
- Replaced the `deploy.py` script with the new unified CLI (`deployment.cli.main`)

Migration Notes
- The old deployment script (`projects/CenterPoint/scripts/deploy.py`) is removed
- Use `python -m deployment.cli.main centerpoint <deploy_config> <model_config>` instead
- ONNX model definitions are under `deployment.projects.centerpoint.onnx_models`
Exported ONNX (Same)
Voxel Encoder

Backbone Head
