Open-source Python toolkit for uncertainty-aware sensor-fusion benchmarking under domain shift and sensor degradation.
This repository is designed for research and reproducible benchmarking workflows, with a practical focus on:
- uncertainty-aware fusion evaluation
- domain-shift scenario stress testing
- sensor perturbation and fault injection
- reproducible benchmark reporting
- Multiple fusion methods:
  - `inverse_variance` (baseline)
  - `adaptive_reliability` (disagreement-aware reliability weighting)
  - `counterfactual_consensus` (novel method in this repo)
- Sensor perturbation engine:
  - `gaussian_noise`
  - `dropout`
  - `bias_drift`
  - `temporal_jitter`
  - `uncertainty_inflation`
- Scenario-driven domain shift benchmarking via JSON config
- Dataset support:
  - synthetic generated dataset
  - SQLite table dataset
- Metrics:
  - Accuracy
  - Negative Log-Likelihood (NLL)
  - Expected Calibration Error (ECE)
  - Uncertainty-Risk Correlation
- CLI tools for benchmark execution and demo DB creation
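As a concrete illustration of the baseline method, inverse-variance fusion weights each sensor's estimate by the inverse of its reported variance, so more certain sensors dominate. The sketch below is a minimal, standalone NumPy version for intuition only; the function name and array layout are illustrative, not the toolkit's internal API (see `src/fusionbench/fusion/uncertainty.py` for the real implementation).

```python
import numpy as np

def inverse_variance_fuse(means, variances):
    """Fuse per-sensor estimates by inverse-variance weighting.

    means, variances: arrays of shape (n_sensors,).
    Returns the fused mean and the fused variance.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = 1.0 / variances                      # more certain sensors weigh more
    fused_mean = np.sum(weights * means) / np.sum(weights)
    fused_var = 1.0 / np.sum(weights)              # fused variance shrinks as sensors are added
    return fused_mean, fused_var

# Example: camera, lidar, radar estimates with differing uncertainty;
# the low-variance lidar estimate (0.6) pulls the fused mean toward itself.
mean, var = inverse_variance_fuse([0.8, 0.6, 0.7], [0.04, 0.01, 0.09])
```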
```
configs/
  sample_benchmark.json
  sqlite_benchmark.json
  nuscenes_mini_benchmark.json
  nuscenes_mini_benchmark_adaptive.json
  nuscenes_mini_benchmark_novel.json
scripts/
  build_nuscenes_sqlite.py
  generate_result_graphs.py
examples/
  run_sqlite_demo.sh
  nuscenes_mini_results_baseline_again.json
  nuscenes_mini_results_adaptive.json
  nuscenes_mini_results_novel.json
figures/
  nll_comparison.svg
  uncertainty_correlation_comparison.svg
src/fusionbench/
  bench/
    metrics.py
    runner.py
  cli/
    main.py
  core/
    types.py
    utils.py
  data/
    synthetic.py
    sqlite_store.py
  domain_shift/
    scenarios.py
  fusion/
    uncertainty.py
  perturbations/
    operators.py
tests/
pyproject.toml
README.md
```
- Python 3.9+
- macOS/Linux/Windows shell
```shell
python3 -m venv .venv
source .venv/bin/activate
pip install -e .          # core install
pip install -e ".[dev]"   # optional: development extras
```

If your environment cannot download packages, you can still run the project directly from the source tree:

```shell
PYTHONPATH=src python3 -m fusionbench.cli.main run \
  --config configs/sample_benchmark.json \
  --output examples/sample_results.json
```

Run the included benchmark config:

```shell
fusionbench run \
  --config configs/sample_benchmark.json \
  --output examples/sample_results.json
```

The command prints the results JSON to the terminal and writes it to `examples/sample_results.json`.
The advanced experiment in this README was run on nuScenes v1.0-mini converted into the toolkit's SQLite format (`sample_id`, `label`, `camera_*`, `lidar_*`, `radar_*` columns).
- Converted DB path: `data/nuscenes_mini_samples.db`
- Conversion script: `scripts/build_nuscenes_sqlite.py`
- Note: the raw dataset itself is not committed to this repository.
```shell
fusionbench make-demo-db \
  --db-path examples/demo_samples.db \
  --n-samples 1500 \
  --seed 11

fusionbench run \
  --config configs/sqlite_benchmark.json \
  --output examples/sqlite_results.json
```

Or run both steps with the bundled script:

```shell
bash examples/run_sqlite_demo.sh
```

Run a benchmark from a config:

```shell
fusionbench run --config <path/to/config.json> --output <path/to/results.json>
```

Create a synthetic SQLite dataset:

```shell
fusionbench make-demo-db --db-path <db.sqlite> [--n-samples 1500] [--seed 42]
```

Top-level JSON keys: `benchmark`, `dataset`, `scenarios`.
```json
{
  "name": "domain_shift_fusion_baseline",
  "seed": 42,
  "calibration_bins": 10,
  "fusion_method": "inverse_variance"
}
```

For advanced fusion methods you can also provide:

- `fusion_method`: one of `inverse_variance` | `adaptive_reliability` | `counterfactual_consensus`
- Method parameters, e.g. `uncertainty_power`, `disagreement_gain`, `agreement_gamma`, `counterfactual_beta`
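For instance, a benchmark block selecting the novel method might look like the following; the parameter value is purely illustrative, not a tuned recommendation:

```json
{
  "name": "domain_shift_fusion_novel",
  "seed": 42,
  "calibration_bins": 10,
  "fusion_method": "counterfactual_consensus",
  "counterfactual_beta": 0.5
}
```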
Synthetic dataset:

```json
{
  "source": "synthetic",
  "n_samples": 1500,
  "seed": 123
}
```

SQLite dataset:

```json
{
  "source": "sqlite",
  "db_path": "examples/demo_samples.db",
  "table": "samples",
  "limit": 1000
}
```

Each scenario supports an `operations` list of perturbations.
```json
{
  "name": "compound_night_shift",
  "description": "Multimodal degradation under low light and weather",
  "operations": [
    { "type": "gaussian_noise", "target": "camera", "std": 0.11 },
    { "type": "dropout", "target": "lidar", "probability": 0.15 },
    { "type": "bias_drift", "target": "radar", "offset": 0.05 },
    { "type": "temporal_jitter", "target": "all", "window": 4 },
    { "type": "uncertainty_inflation", "target": "all", "factor": 1.2 }
  ]
}
```

- Accuracy: classification correctness under a thresholded fused score
- NLL: probabilistic quality (lower is better)
- ECE: calibration gap between confidence and empirical accuracy (lower is better)
- Uncertainty-Risk Correlation: whether higher uncertainty tracks prediction failures
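To make the last two metrics concrete, here is a minimal sketch of ECE (equal-width confidence bins) and of uncertainty-risk correlation as a Pearson correlation between uncertainty and the error indicator. This is an illustrative reference implementation; the toolkit's own versions in `src/fusionbench/bench/metrics.py` may differ in detail.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: per-bin |mean confidence - empirical accuracy|, weighted by bin occupancy."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap        # weight by fraction of samples in the bin
    return ece

def uncertainty_risk_correlation(uncertainties, correct):
    """Pearson correlation between uncertainty and the error indicator (1 - correct).

    Positive values mean higher uncertainty tracks prediction failures.
    """
    u = np.asarray(uncertainties, dtype=float)
    err = 1.0 - np.asarray(correct, dtype=float)
    return float(np.corrcoef(u, err)[0, 1])
```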
Methods compared: `inverse_variance` (baseline), `adaptive_reliability`, `counterfactual_consensus` (novel).
| Scenario | Method | Accuracy | NLL | ECE | Uncertainty-Risk Corr |
|---|---|---|---|---|---|
| baseline | inverse_variance | 0.655941 | 1.385254 | 0.322141 | -0.008785 |
| baseline | adaptive_reliability | 0.655941 | 0.711751 | 0.193716 | 0.338457 |
| baseline | counterfactual_consensus | 0.660891 | 1.259207 | 0.322773 | 0.259458 |
| weather_and_occlusion_shift | inverse_variance | 0.655941 | 1.431917 | 0.298746 | -0.028407 |
| weather_and_occlusion_shift | adaptive_reliability | 0.655941 | 0.785094 | 0.202290 | 0.181156 |
| weather_and_occlusion_shift | counterfactual_consensus | 0.658416 | 1.667270 | 0.305797 | 0.174910 |
- `adaptive_reliability` gives the best probabilistic quality on this dataset (lower NLL, lower ECE).
- `counterfactual_consensus` substantially improves uncertainty-risk correlation over the baseline and slightly improves accuracy.
- Under stronger perturbation, `counterfactual_consensus` keeps better uncertainty correlation than the baseline, but NLL remains an open optimization target.
Run the unit test suite:
```shell
pytest
```

- Keep `seed` fixed in both benchmark and dataset configs.
- Version-control your config files and output JSON.
- Compare scenarios using identical dataset and model settings.
- Plug in real sensor datasets by implementing an additional loader under `src/fusionbench/data/`.
- Add advanced fusion methods under `src/fusionbench/fusion/`.
- Add perturbation types (occlusion masks, blur kernels, packet loss models) in `src/fusionbench/perturbations/operators.py`.
- Add domain-specific metrics in `src/fusionbench/bench/metrics.py`.
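For instance, a packet-loss perturbation could be sketched roughly as below. The class shape here is a guess for illustration only; check `src/fusionbench/perturbations/operators.py` for the actual base class and registration mechanism before wiring a new operator in.

```python
import numpy as np

# Hypothetical shape of a new perturbation operator; the real interface in
# src/fusionbench/perturbations/operators.py may differ from this sketch.
class PacketLoss:
    """Drop whole readings with probability p, simulating network packet loss."""

    def __init__(self, probability=0.1, seed=None):
        self.probability = probability
        self.rng = np.random.default_rng(seed)

    def apply(self, values):
        values = np.asarray(values, dtype=float)
        lost = self.rng.random(values.shape) < self.probability
        out = values.copy()
        out[lost] = np.nan          # lost packets become missing readings
        return out
```

Keeping the random generator seeded per operator keeps scenario runs reproducible, matching the repository's fixed-seed convention.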
- `fusionbench: command not found`
  - Ensure the virtual environment is active and `pip install -e .` completed successfully.
- `dataset.db_path is required`
  - Set `dataset.source` to `sqlite` only when `db_path` is provided.
- SQLite table errors
  - Confirm the target table exists and its columns match the expected schema.
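One quick way to check the table and columns is with the standard-library `sqlite3` module. This helper is not part of the toolkit; the `"samples"` default matches the demo config, but pass whatever table your config names.

```python
import sqlite3

def inspect_db(db_path, table="samples"):
    """Return (table names, column names of `table`) for a benchmark SQLite DB."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        # PRAGMA table_info returns one row per column; index 1 is the name.
        # The table name is interpolated, so only pass trusted values.
        cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
        return tables, cols
    finally:
        conn.close()

# Usage: inspect_db("examples/demo_samples.db")
```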
- Holger Caesar et al., "nuScenes: A multimodal dataset for autonomous driving", CVPR 2020. https://arxiv.org/abs/1903.11027
- nuScenes dataset website: https://www.nuscenes.org/
MIT (see LICENSE).
Verified on March 4, 2026 with:
```shell
PYTHONPATH=src PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest -q   # -> 6 passed
PYTHONPATH=src python3 -m fusionbench.cli.main run --config configs/sample_benchmark.json --output examples/sample_results.json
PYTHONPATH=src python3 -m fusionbench.cli.main make-demo-db --db-path examples/demo_samples.db --n-samples 1500 --seed 11
PYTHONPATH=src python3 -m fusionbench.cli.main run --config configs/sqlite_benchmark.json --output examples/sqlite_results.json
python3 scripts/build_nuscenes_sqlite.py --nuscenes-meta-dir data/v1.0-mini --out-db data/nuscenes_mini_samples.db
PYTHONPATH=src python3 -m fusionbench.cli.main run --config configs/nuscenes_mini_benchmark.json --output examples/nuscenes_mini_results_baseline_again.json
PYTHONPATH=src python3 -m fusionbench.cli.main run --config configs/nuscenes_mini_benchmark_adaptive.json --output examples/nuscenes_mini_results_adaptive.json
PYTHONPATH=src python3 -m fusionbench.cli.main run --config configs/nuscenes_mini_benchmark_novel.json --output examples/nuscenes_mini_results_novel.json
```