Propensity Score Estimation Model

class caliber.ood.propensity_score.PropensityScoreEstimationModel(binary_classification_estimation_model=XGBClassifier(objective='binary:logistic'), binary_classification_calibration_model=BetaBinaryClassificationModel(), calib_frac=0.5, clip_range=(0.05, 0.95), seed=0)[source]

A propensity score estimation model. Given inputs drawn from a source and a target distribution, together with binary targets indicating which distribution each input belongs to, the model fits a classifier and then calibrates its estimate of the probability that an input belongs to the distribution labeled 1. Finally, it predicts the propensity score as the corresponding odds ratio, clipped so that it stays bounded.
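
The final step described above can be illustrated with a minimal sketch. It is not the library's implementation: the helper name clipped_odds is hypothetical, and it assumes that the calibrated probability is clipped to clip_range before the odds ratio is formed, which the description does not state explicitly.

```python
def clipped_odds(prob, clip_range=(0.05, 0.95)):
    """Turn a probability P(label = 1 | x) into a bounded odds ratio.

    Assumption (not confirmed by the documentation): the probability is
    clipped to clip_range first, and the odds ratio is computed afterwards.
    """
    low, high = clip_range
    clipped = min(max(prob, low), high)
    return clipped / (1.0 - clipped)


# Probabilities near 0 or 1 would give odds near 0 or infinity without clipping.
probs = [0.001, 0.5, 0.999]
scores = [clipped_odds(p) for p in probs]
```

Under this assumption, the default range keeps the score between roughly 0.05 / 0.95 and 0.95 / 0.05, whatever the classifier outputs.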

Args:
binary_classification_estimation_model (optional): A binary classification model that classifies inputs as belonging to the source or the target distribution. The model must provide fit and predict_proba methods with signatures analogous to those of standard Scikit-Learn binary classification models. Defaults to XGBClassifier(objective='binary:logistic').

binary_classification_calibration_model (AbstractBinaryClassificationModel | None, optional): A binary classification model to calibrate the probability returned by the estimation model. Defaults to BetaBinaryClassificationModel().

calib_frac (float, optional): The fraction of the data reserved for calibration. Defaults to 0.5.

clip_range (tuple[float, float], optional): The range to clip the propensity score within. Defaults to (0.05, 0.95).

seed (int, optional): A random seed. Defaults to 0.
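
To make the roles of calib_frac and seed concrete, here is a rough sketch of how a seeded split into fitting and calibration subsets could work. The helper split_for_calibration is hypothetical, and the use of a shuffled index split is an assumption; the library may split the data differently.

```python
import random


def split_for_calibration(n_samples, calib_frac=0.5, seed=0):
    """Return (fit_indices, calib_indices) from a seeded shuffle.

    Hypothetical helper: it assumes a simple random split in which
    calib_frac of the samples are held out for calibration.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # same seed gives the same split
    n_calib = int(round(calib_frac * n_samples))
    return indices[n_calib:], indices[:n_calib]


fit_idx, calib_idx = split_for_calibration(10, calib_frac=0.5, seed=0)
```

With calib_frac=0.5, half of the samples would train the estimation model and the other half would be used only to fit the calibration model.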

fit(X, y)[source]

Fits the propensity score model.

Return type:

None

Args:

X (NDArray[np.float64]): Inputs from source and target distributions.

y (NDArray[np.int64]): Binary targets describing which distribution an input belongs to.
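
As an illustration of the expected inputs, the sketch below assembles X and y from two toy samples. The values are made up, and labeling the target distribution as 1 is a choice made here for the example. Plain lists are used to keep the sketch dependency-free; in practice they would be converted to NumPy arrays of the documented dtypes before calling fit.

```python
# Toy inputs with two features each; the numbers are purely illustrative.
source_inputs = [[0.1, 1.2], [0.3, 0.9], [0.2, 1.1]]
target_inputs = [[1.5, 0.2], [1.7, 0.1]]

# Stack both samples into one input set and record where each row came from:
# 0 marks a source input, 1 marks a target input (an assumption of this example).
X = source_inputs + target_inputs
y = [0] * len(source_inputs) + [1] * len(target_inputs)
```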

predict(X)[source]

Predicts the propensity score.

Return type:

ndarray[tuple[int, ...], dtype[float64]]

Args:

X (NDArray[np.float64]): Test inputs.

Returns:

NDArray[np.float64]: The estimated propensity score for each input.