pppms is an R package for post-selection inference in predictive modeling. It implements multiplicity-adjusted bootstrap tilting methods that construct lower confidence limits for prediction performance after selecting the empirically best candidate model.
The methods implemented in this package originate from the dissertation:

Rink, P. (2025). Confidence Limits for Prediction Performance. PhD thesis, University of Bremen. https://doi.org/10.26092/elib/3822
In many predictive modeling workflows, several candidate models are trained and compared using the same evaluation data.
Typical workflow:
- Fit multiple candidate models
- Estimate their prediction performance
- Select the empirically best model
- Report its estimated performance
However, this procedure ignores the uncertainty introduced by the model selection step. Selecting the best model among several candidates inflates the observed performance and can lead to overly optimistic conclusions.
pppms provides statistically valid lower confidence limits for
prediction performance that explicitly account for model selection.
```r
# install.packages("remotes")
remotes::install_github("pascalrink/pppms")
```

```r
library(pppms)

true_labels <- c(0, 0, 1, 1, 0, 1)
pred_labels <- cbind(
  model1 = c(0, 0, 1, 1, 1, 1),
  model2 = c(0, 1, 1, 0, 0, 1)
)

res <- MabtCI(
  true_labels,
  pred_labels,
  B = 200,
  seed = 1
)
```
```r
res
```

Returned values:

- `bound` – lower confidence limit for prediction performance
- `tau` – estimated tilting parameter
- `t0` – empirical performance of the selected model
- `selected_idx` – index of the selected model
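The components listed above can be inspected individually; a minimal sketch, assuming `res` is the object returned by the `MabtCI()` call shown earlier:

```r
# Inspect individual components of the result
# (assumes `res` is the list returned by MabtCI() above)
res$bound         # lower confidence limit for the selected model's performance
res$t0            # empirical (unadjusted) performance of the selected model
res$selected_idx  # index of the selected candidate model
```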
The procedure combines two ideas:
Multiplicity adjustment
Model selection creates a multiple comparison problem. The procedure
therefore uses a max-type calibration across candidate models.
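As a generic illustration of the max-type idea (a simplified base-R sketch, not the package's exact procedure): bootstrap the maximum over candidate models of the centered performance estimates, and use a quantile of that max distribution as a common critical value. Accuracy is used here as an illustrative performance measure, with the same toy data as in the example above.

```r
# Simplified sketch of max-type bootstrap calibration (illustrative only;
# NOT the exact algorithm implemented in pppms).
set.seed(1)
true_labels <- c(0, 0, 1, 1, 0, 1)
pred_labels <- cbind(
  model1 = c(0, 0, 1, 1, 1, 1),
  model2 = c(0, 1, 1, 0, 0, 1)
)
acc <- colMeans(pred_labels == true_labels)  # observed per-model accuracies

B <- 200
n <- length(true_labels)
max_dev <- replicate(B, {
  idx <- sample(n, replace = TRUE)  # bootstrap resample of evaluation cases
  acc_b <- colMeans(pred_labels[idx, , drop = FALSE] == true_labels[idx])
  max(acc_b - acc)                  # maximum deviation across all candidates
})
crit <- quantile(max_dev, 0.95)     # common critical value across candidates
```

Calibrating against the maximum over all candidates, rather than each model separately, is what controls the optimism introduced by picking the best model.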
Bootstrap tilting
Bootstrap resampling is modified using weights

```
w_i(tau) ∝ exp(tau * psi_i)
```

where `psi_i` is an empirical influence quantity and `tau` is a tilting parameter chosen so that the bootstrap distribution matches the target significance level.
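The tilting weights can be computed directly; a minimal base-R sketch, where `psi` is illustrative stand-in data (not output of the package) and `tau` is fixed rather than calibrated:

```r
# Normalized bootstrap tilting weights: w_i(tau) ∝ exp(tau * psi_i)
# (`psi` is an illustrative stand-in for the empirical influence values;
#  in the actual procedure, tau is chosen by calibration, not fixed.)
psi <- c(-0.5, 0.1, 0.4, -0.2, 0.2)
tau <- 0.8
w <- exp(tau * psi)
w <- w / sum(w)   # normalize so the weights form a resampling distribution
```

Positive `tau` shifts resampling probability toward observations with larger `psi_i`, which is how the bootstrap distribution is tilted toward the target significance level.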
For methodological background see

```r
vignette("methodological-background", package = "pppms")
```

and the thesis:

Rink, P. (2025). Confidence Limits for Prediction Performance. PhD thesis, University of Bremen. https://doi.org/10.26092/elib/3822