API Reference

class khalib.KhalibClassifier(estimator=None, train_size=0.75, random_state=None)

Bases: ClassifierMixin, MetaEstimatorMixin, BaseEstimator

Classifier calibration via Khiops

This estimator uses the Khiops regularized histogram construction method (MODL) to calibrate the probabilities/scores coming from a classifier.

For binary classifiers, it fits a Khiops histogram for the scores for the positive class. Then, for a given score, it estimates its calibrated probability as the positive class probability of the bin in which the score falls.

For multi-class classifiers, it fits a Khiops histogram for each class in a one-vs-rest fashion. Then, given a score vector, it estimates the calibrated probabilities by applying the binary method described above to each one-vs-rest problem and then normalizing the results so that they sum to one.
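The binary and one-vs-rest procedures described above can be sketched in pure Python. This is only an illustration, not khalib's implementation: the helper names (`bin_index`, `calibrate_binary_sketch`, `calibrate_multiclass_sketch`) are hypothetical, the bins are assumed to be closed on the left (with the last bin closed on both sides), and the uniform fallback for an all-zero score vector is an assumption of this sketch.

```python
from bisect import bisect_right


def bin_index(score, breakpoints):
    """Index of the bin containing `score`, given sorted bin edges.

    Assumption: bins are closed on the left, the last bin is closed on
    both sides, and out-of-range scores are clamped to the nearest bin.
    """
    n_bins = len(breakpoints) - 1
    i = bisect_right(breakpoints, score) - 1
    return min(max(i, 0), n_bins - 1)


def calibrate_binary_sketch(scores, breakpoints, positive_rates):
    """Map each score to the positive-class rate of its bin."""
    return [positive_rates[bin_index(s, breakpoints)] for s in scores]


def calibrate_multiclass_sketch(score_rows, breakpoints_per_class, rates_per_class):
    """Apply the binary method per class, then normalize each row to one."""
    calibrated = []
    for row in score_rows:
        raw = [
            rates_per_class[k][bin_index(s, breakpoints_per_class[k])]
            for k, s in enumerate(row)
        ]
        total = sum(raw)
        if total == 0:
            # Assumption of this sketch: fall back to a uniform distribution.
            calibrated.append([1.0 / len(raw)] * len(raw))
        else:
            calibrated.append([p / total for p in raw])
    return calibrated


# Binary case: three bins with positive rates 0.1, 0.5 and 0.9.
binary = calibrate_binary_sketch(
    [0.05, 0.3, 0.95, 1.0], [0.0, 0.3, 0.7, 1.0], [0.1, 0.5, 0.9]
)

# Multi-class case: one two-bin histogram per class.
edges = [[0.0, 0.5, 1.0]] * 3
rates = [[0.2, 0.9], [0.3, 0.6], [0.3, 0.5]]
multi = calibrate_multiclass_sketch([[0.7, 0.2, 0.1]], edges, rates)
```

In the multi-class call, the raw one-vs-rest outputs are 0.9, 0.3 and 0.3, so after normalization the row is approximately 0.6, 0.2 and 0.2.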

This class closely emulates the CalibratedClassifierCV interface but without cross-validation, as the Khiops models are regularized and do not need it.

Parameters:
estimator: estimator instance, default=None

The classifier whose output needs to be calibrated to provide more accurate predict_proba outputs. If not provided, a KhiopsClassifier is trained. If provided but not yet fitted, it is fitted during fit on a training split whose size is given by train_size.

train_size: float or int, default=0.75

Same parameter as train_test_split. Only used when estimator is not provided or not fitted.

random_state: int, default=None

Same parameter as train_test_split. Only used when estimator is not provided or not fitted.

Attributes:
fitted_estimator_

The underlying fitted classifier. It is the same instance as estimator if it was provided, otherwise it is an instance of KhiopsClassifier.

histograms_: list of size 1 or n_classes

In the binary case: a list of size 1 with the Histogram instance for the positive class. In the multi-class case: a list of size n_classes containing the one-vs-rest Histogram for each class.

fit(X, y)

Fits the calibrated model

Parameters:
X: array-like of shape (n_samples, n_features)

Training data.

y: array-like of shape (n_samples,)

Target values.

Returns:
self

The calling estimator instance.

predict_proba(X)

Estimates the calibrated classification probabilities

Parameters:
X: array-like of shape (n_samples, n_features)

Input data.

Returns:
P: array-like of shape (n_samples, n_classes)

The estimated calibrated probabilities for each class.

khalib.calibration_error(y_scores, y, method: str = 'label-bin', histogram_method: str = 'khiops', max_bins: int = 0, multi_class_method: str = 'top-label', histogram: Histogram | None = None)

Estimates the ECE via binning

The default binning method “khiops” finds a regularized histogram for which the distribution of the target is as pure as possible.

Parameters:
y_scores: array-like of shape (n_samples,) or (n_samples, 1) or (n_samples, 2)

Scores/probabilities. If it is a 2-D array the column indexed as 1 will be used for the estimation.

y: array-like of shape (n_samples,) or (n_samples, n_classes)

Target values (class labels).

method: {“label-bin”, “bin”}, default=“label-bin”

ECE estimation method. See below for details.

multi_class_method: {“top-label”, “classwise”}, default=“top-label”

Multi-class ECE estimation method:

  • “top-label”: Estimates the ECE for the confidence scores (i.e. the predicted class score).

  • “classwise”: Estimates the ECE for each class in a one-vs-rest fashion and then averages the results.

histogram_method: {“khiops”, “eq-freq”, “eq-width”}, default=“khiops”

Histogram method:

  • “khiops”: A non-parametric regularized histogram method. It finds the best histogram such that the distribution of the target is constant in each bin.

  • “eq-freq”: All bins have the same number of elements. If many instances share the same value, the algorithm puts them in their own bin, which will be larger than the others.

  • “eq-width”: All bins have the same width.

If the method is set to “eq-freq” or “eq-width” then ‘y’ is ignored.

max_bins: int, default=0

The maximum number of bins to be created. The algorithms usually create this number of bins but they may create fewer. The default value 0 means:

  • For “khiops”: that there is no limit to the number of intervals.

  • For “eq-freq” or “eq-width”: that 10 is the maximum number of intervals.

histogram: Histogram, optional

A ready-made histogram. If set then it is used for the ECE computation and the parameters histogram_method and max_bins are ignored.

Notes

We present the formulas for the two binary ECE estimators used in this function, as described in [RCSM22]. Both approximate the theoretical calibration error:

\[\mathbb{E}\left[ \left| \mathbb{P}\left[ Y = 1 | S \right] - S \right| \right]\]

where the pair \((S, Y)\) is a random vector taking values in \([0, 1] \times \lbrace 0, 1 \rbrace\). The random variable \(S\) represents the (normalized) scores coming from a classifier, and \(Y\) the target.

Let \(\lbrace B_i \rbrace_{i=1}^I\) be a binning of \([0,1]\) and \(\lbrace (s_n, y_n) \rbrace_{n=1}^N\) a sample following the distribution of \((S, Y)\).

The label-bin ECE estimation is given by:

\[\text{ECE}_{\text{lb}} = \frac{1}{N} \sum_{n=1}^N \sum_{i = 1}^I \mathbb{1}_{s_n \in B_i} \left| s_n - \bar{y_i} \right|\]

where \(\bar{y_i}\) is the sample conditional expectation of \(Y\) in the \(i\)-th bin, with \(|B_i|\) denoting the number of samples whose score falls in \(B_i\):

\[\bar{y_i} = \frac{\sum_{n=1}^N \mathbb{1}_{s_n \in B_i} \cdot y_n}{|B_i|}.\]

On the other hand, the bin ECE estimation is given by:

\[\text{ECE}_{\text{bin}} = \sum_{i = 1}^I \frac{\left| B_i \right| }{N} \left| \bar{s_i} - \bar{y_i} \right|\]

where \(\bar{y_i}\) is as above, and \(\bar{s_i}\) is the sample conditional expectation of \(S\) in the \(i\)-th bin:

\[\bar{s_i} = \frac{\sum_{n=1}^N \mathbb{1}_{s_n \in B_i} \cdot s_n}{|B_i|}.\]

The bin estimator is a lower bound for the real ECE for any binning used. And, for a given binning, it is a lower bound of the label-bin estimator.
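
The second claim follows from the triangle inequality. As a short derivation, with \(|B_i|\) read as the number of samples whose score falls in the \(i\)-th bin, each non-empty bin satisfies:

\[|B_i| \left| \bar{s_i} - \bar{y_i} \right| = \left| \sum_{n=1}^N \mathbb{1}_{s_n \in B_i} \left( s_n - \bar{y_i} \right) \right| \leq \sum_{n=1}^N \mathbb{1}_{s_n \in B_i} \left| s_n - \bar{y_i} \right|.\]

Summing over \(i = 1, \dots, I\) and dividing by \(N\) gives \(\text{ECE}_{\text{bin}} \leq \text{ECE}_{\text{lb}}\).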

The bin estimator is the most common in the literature, where it is usually called the plugin estimator, or even presented as the definition of the ECE. Despite this fact, we choose the label-bin estimator as the default method because it is exact when the conditional distribution \(\mathbb{P}\left[ Y = 1 | S = s \right]\) is piecewise constant in \(s\). These piecewise constant models are exactly the hypothesis space of any histogram estimation of the ECE.
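
As an illustration of the two estimators, here is a minimal pure-Python sketch. It is not khalib's implementation: the function name `ece_estimates` is hypothetical, and it assumes bins closed on the left (the last bin closed on both sides), a convention that this reference does not specify.

```python
from bisect import bisect_right


def ece_estimates(scores, labels, breakpoints):
    """Return (label-bin ECE, bin ECE) for binary labels in {0, 1}.

    Assumption: bins are closed on the left and the last bin is closed
    on both sides; out-of-range scores are clamped to the nearest bin.
    """
    n_bins = len(breakpoints) - 1
    n = len(scores)
    counts = [0] * n_bins
    score_sums = [0.0] * n_bins
    label_sums = [0.0] * n_bins

    # Assign every score to a bin and accumulate per-bin statistics.
    indexes = []
    for s, y in zip(scores, labels):
        i = min(max(bisect_right(breakpoints, s) - 1, 0), n_bins - 1)
        indexes.append(i)
        counts[i] += 1
        score_sums[i] += s
        label_sums[i] += y

    # Per-bin sample means; empty bins contribute nothing to either sum.
    y_bar = [label_sums[i] / counts[i] if counts[i] else 0.0 for i in range(n_bins)]
    s_bar = [score_sums[i] / counts[i] if counts[i] else 0.0 for i in range(n_bins)]

    # Label-bin estimator: average distance of each score to its bin's label mean.
    ece_lb = sum(abs(s - y_bar[i]) for s, i in zip(scores, indexes)) / n
    # Bin estimator: weighted distance between per-bin score and label means.
    ece_bin = sum(counts[i] / n * abs(s_bar[i] - y_bar[i]) for i in range(n_bins))
    return ece_lb, ece_bin


# A single bin whose scores straddle the label mean.
ece_lb, ece_bin = ece_estimates([0.2, 0.8], [0, 1], [0.0, 1.0])
```

In this example both the score mean and the label mean of the bin are 0.5, so the bin estimate is zero, while the label-bin estimate is approximately 0.3. The gap shows how averaging the scores inside a bin can hide miscalibration.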

khalib.build_reliability_diagram(y_scores, y, dirac_threshold=1e-06, log_plot_threshold=3.0, min_density_bar_width=0.0025)

Builds a reliability diagram with the target score distribution below

To build the diagram this function uses Khiops supervised histograms. To build the score distribution, it uses Khiops unsupervised histograms. Using the latter, it implements a heuristic to detect when the scores are distributed as a sum of Dirac masses and changes the visualization accordingly.

Parameters:
y_scores: array-like of shape (n_samples,) or (n_samples, 1) or (n_samples, 2)

Scores/probabilities. If it is a 2-D array the column indexed as 1 will be used for the estimation.

y: array-like of shape (n_samples,) or (n_samples, n_classes)

Target values (class labels).

dirac_threshold: float, default=1.0e-06

If a bin in the scores’ unsupervised histogram is lower than this ‘dirac_threshold’ then it is considered a dirac mass.

log_plot_threshold: float, default=3.0

Density plot only: If the log-difference between the maximal and minimal positive density values is larger than ‘log_plot_threshold’ then the density plot uses a log scale in the y-axis.

min_density_bar_width: float, default=5.0e-03

Density plot only: If a bin of the scores’ unsupervised histogram has a width lower than ‘min_density_bar_width’ then it is plotted as having a width of ‘min_density_bar_width’.

Returns:
tuple

A 2-tuple containing:

class khalib.Histogram(breakpoints: list[float], freqs: list[int], target_freqs: list[tuple] = <factory>, classes: list = <factory>)

Bases: object

A histogram with optional target variable statistics

Note

To obtain instances from real data, prefer the factory method from_data rather than the base constructor.

Attributes:
breakpoints: list[float]

The breakpoints defining the histogram in increasing order.

freqs: list of float

The frequencies of each bin.

densities: list[float]

The densities of each bin.

classes: list

The class labels.

target_freqs: list[tuple], optional

The target frequencies in each bin. Each tuple has the same size as classes.

target_probas: list[tuple], optional

The target conditional probabilities in each bin. Each tuple has the same size as classes.

property n_bins: int

Number of histogram bins.

property bins: list[tuple]

List of histogram bins.

property classes_type: type | None

Type of the target classes (only for histograms built with y).

classmethod from_data(x, y=None, method: str = 'khiops', max_bins: int = 0, use_finest: bool = False) -> Histogram

Computes a histogram of a 1D vector via Khiops

Parameters:
x: array-like of shape (n_samples,) or (n_samples, 1)

Input scores.

y: array-like of shape (n_samples,) or (n_samples, 1), optional

Target values.

method: {“khiops”, “eq-freq”, “eq-width”}, default=“khiops”

Histogram method:

  • “khiops”: A non-parametric regularized histogram method.

  • “eq-freq”: All bins have the same number of elements. If many instances share the same value, the algorithm puts them in their own bin, which will be larger than the others.

  • “eq-width”: All bins have the same width.

If the method is set to “eq-freq” or “eq-width” then ‘y’ is ignored.

max_bins: int, default=0

The maximum number of bins to be created. The algorithms usually create this number of bins but they may create fewer. The default value 0 means:

  • For “khiops”: that there is no limit to the number of intervals.

  • For “eq-freq” or “eq-width”: that 10 is the maximum number of intervals.

use_finest: bool, default=False

Unsupervised ‘khiops’ histogram only: If True it builds the finest histogram instead of the most interpretable.

Returns:
Histogram

The histogram object containing the bin limits and frequencies.
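
To make the “eq-width” and “eq-freq” options more concrete, the following pure-Python sketch computes breakpoints for both. It is an illustration under stated assumptions, not the code khalib runs: the function names are hypothetical, the default of 10 bins mirrors the documented meaning of max_bins=0 for these methods, and the handling of tied values (dropping duplicate cut points) is one simple way to obtain fewer, larger bins.

```python
def equal_width_breakpoints(values, n_bins=10):
    """Breakpoints of `n_bins` bins of identical width over the data range."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("at least two distinct values are needed")
    width = (hi - lo) / n_bins
    # The last edge is set to `hi` directly to avoid rounding drift.
    return [lo + i * width for i in range(n_bins)] + [hi]


def equal_freq_breakpoints(values, n_bins=10):
    """Breakpoints of at most `n_bins` bins with roughly equal counts.

    Assumption of this sketch: duplicate cut points caused by tied values
    are dropped, so heavily repeated values end up in one larger bin.
    """
    ordered = sorted(values)
    if ordered[0] == ordered[-1]:
        raise ValueError("at least two distinct values are needed")
    n = len(ordered)
    edges = [ordered[0]]
    for i in range(1, n_bins):
        cut = ordered[(i * n) // n_bins]
        if cut > edges[-1]:
            edges.append(cut)
    if ordered[-1] > edges[-1]:
        edges.append(ordered[-1])
    return edges


width_edges = equal_width_breakpoints([0.0, 0.4, 1.0], n_bins=4)
freq_edges = equal_freq_breakpoints([1, 2, 3, 4, 5, 6, 7, 8], n_bins=4)
tied_edges = equal_freq_breakpoints([1, 1, 1, 1, 1, 1, 2, 3], n_bins=4)
```

With six tied values out of eight, the last call yields only two bins instead of the requested four, which matches the documented behaviour that the algorithms may create fewer bins than max_bins.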

classmethod from_data_and_breakpoints(x, breakpoints: list[float], y=None) -> Histogram

Builds a histogram from a list of breakpoints and data

Parameters:
x: array-like of shape (n_samples,) or (n_samples, 1)

Vector with the values to discretize for the histogram.

breakpoints: list[float]

A sorted list of floats defining the bin edges.

y: array-like of shape (n_samples,) or (n_samples, 1), optional

Target values associated to each element in ‘x’.
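
The statistics that such a histogram carries (freqs, target_freqs and target_probas) can be derived from the data and the breakpoints as in the sketch below. It is a hedged illustration rather than khalib's code: `histogram_stats` is a hypothetical helper, the classes are assumed to be the sorted distinct labels, and bins are assumed to be closed on the left with the last bin closed on both sides.

```python
from bisect import bisect_right


def histogram_stats(x, breakpoints, y):
    """Per-bin counts, per-class counts and per-class conditional probabilities."""
    classes = sorted(set(y))  # assumption: classes are the sorted distinct labels
    n_bins = len(breakpoints) - 1
    freqs = [0] * n_bins
    target_freqs = [[0] * len(classes) for _ in range(n_bins)]

    for value, label in zip(x, y):
        # Locate the bin, clamping out-of-range values to the nearest bin.
        i = min(max(bisect_right(breakpoints, value) - 1, 0), n_bins - 1)
        freqs[i] += 1
        target_freqs[i][classes.index(label)] += 1

    # Conditional class probabilities; empty bins get zeros in this sketch.
    target_probas = [
        tuple(count / freq if freq else 0.0 for count in row)
        for row, freq in zip(target_freqs, freqs)
    ]
    return classes, freqs, [tuple(row) for row in target_freqs], target_probas


classes, freqs, target_freqs, target_probas = histogram_stats(
    x=[0.1, 0.2, 0.6, 0.9], breakpoints=[0.0, 0.5, 1.0], y=[0, 1, 1, 1]
)
```

Each tuple in target_freqs and target_probas has one entry per class, in the same order as classes, which mirrors the attribute descriptions of Histogram.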

find(value: float) -> int

Returns the histogram bin index for a value

Parameters:
value: float

The value to look up in the histogram bins.

Returns:
int

The index of the bin containing the value.

vfind(values)

Returns the histogram bin indexes for a value sequence

Parameters:
values: array-like

The values to look up in the histogram bins.

Returns:
array-like

The indexes of the bins containing the values.

khalib.calibrate_binary(y_scores, histogram: Histogram, only_positive: bool = False)

Calibrates a binary score with a histogram

This function may be used to calibrate binary classification scores without the use of the KhalibClassifier estimator.

Parameters:
y_scores: array-like of shape (n_samples,) or (n_samples, 1)

Scores for the positive class.

histogram: Histogram

A histogram with target statistics (i.e. Histogram.target_freqs must be non-empty).

only_positive: bool, default=False

If True, it returns only a 1-D array with the probabilities of the positive class. Otherwise, it returns a 2-D array with the probabilities for both classes.

References

[RCSM22]

Roelofs, R., Cain, N., Shlens, J. & Mozer, M.C. (2022). Mitigating Bias in Calibration Error Estimation. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:4036-4054. Available from https://proceedings.mlr.press/v151/roelofs22a.html.