API Reference¶

class khalib.KhalibClassifier(estimator=None, train_size=0.75, random_state=None)¶

Bases: ClassifierMixin, MetaEstimatorMixin, BaseEstimator

Classifier calibration via Khiops

This estimator uses the Khiops regularized histogram construction method (MODL) to calibrate the probabilities/scores coming from a classifier.

For binary classifiers, it fits a Khiops histogram for the scores for the positive class. Then, for a given score, it estimates its calibrated probability as the positive class probability of the bin in which the score falls.

For multi-class classifiers, it fits a Khiops histogram for each class in a one-vs-rest fashion. Then, given a score vector, it estimates the calibrated probabilities by applying the binary method described above to each one-vs-rest problem and then normalizing the results so that they sum to one.

This class closely emulates the CalibratedClassifierCV interface but without cross-validation, as the Khiops models are regularized and do not need it.
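The one-vs-rest procedure described above can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper names, the `(breakpoints, positive_rates)` representation of a histogram, the bin boundary convention, and the uniform fallback are all assumptions made for the example.

```python
from bisect import bisect_right


def bin_index(breakpoints, score):
    """Index of the bin containing `score` for sorted edges b_0 < ... < b_K.

    Assumption: bins are closed on the left, and out-of-range scores are
    clamped to the first or last bin.
    """
    i = bisect_right(breakpoints, score) - 1
    return min(max(i, 0), len(breakpoints) - 2)


def calibrate_one_vs_rest(scores, histograms):
    """Calibrate one score vector with one histogram per class.

    `histograms` holds one hypothetical (breakpoints, positive_rates) pair
    per class, where positive_rates[i] is the fraction of positives in bin i.
    """
    raw = [
        rates[bin_index(edges, s)]
        for s, (edges, rates) in zip(scores, histograms)
    ]
    total = sum(raw)
    if total == 0:
        # Assumption: fall back to a uniform distribution when all rates are 0.
        return [1.0 / len(raw)] * len(raw)
    return [p / total for p in raw]


# Made-up per-class histograms for a 3-class problem.
histograms = [
    ([0.0, 0.5, 1.0], [0.1, 0.7]),
    ([0.0, 0.3, 1.0], [0.2, 0.6]),
    ([0.0, 0.4, 1.0], [0.05, 0.5]),
]
probas = calibrate_one_vs_rest([0.6, 0.2, 0.2], histograms)
```

Each class score is first replaced by the positive rate of its bin, and the resulting vector is then rescaled so that its entries sum to one.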

Parameters:

estimator (estimator instance, default=None): The classifier whose output needs to be calibrated to provide more accurate predict_proba outputs. If not provided, a KhiopsClassifier is trained. If provided but not fitted, it is fitted during fit on the training split determined by train_size.
train_size (float or int, default=0.75): Same parameter as in train_test_split. Only used when estimator is not provided or not fitted.
random_state (int, default=None): Same parameter as in train_test_split. Only used when estimator is not provided or not fitted.

Attributes:

fitted_estimator_: The underlying fitted classifier. It is the same instance as estimator if it was provided, otherwise it is an instance of KhiopsClassifier.
histograms_ (list of size 1 or n_classes): In the binary case, a list of size 1 with the Histogram instance for the positive class. In the multi-class case, a list of size n_classes containing the one-vs-rest Histogram for each class.

fit(X, y)¶

Fits the calibrated model

Parameters:

X (array-like of shape (n_samples, n_features)): Training data.
y (array-like of shape (n_samples,)): Target values.

Returns:

self: The calling estimator instance.

predict_proba(X)¶

Estimates the calibrated classification probabilities

Parameters:

X (array-like of shape (n_samples, n_features)): Samples for which to estimate the calibrated probabilities.

Returns:

P (array-like of shape (n_samples, n_classes)): The estimated calibrated probabilities for each class.

get_metadata_routing()¶

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing (MetadataRequest): A MetadataRequest encapsulating routing information.

get_params(deep=True)¶

Get parameters for this estimator.

Parameters:

deep (bool, default=True): If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict): Parameter names mapped to their values.

score(X, y, sample_weight=None)¶

Return accuracy on provided data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:

X (array-like of shape (n_samples, n_features)): Test samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)): True labels for X.
sample_weight (array-like of shape (n_samples,), default=None): Sample weights.

Returns:

score (float): Mean accuracy of self.predict(X) w.r.t. y.

set_params(**params)¶

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict): Estimator parameters.

Returns:

self (estimator instance): Estimator instance.

khalib.calibration_error(y_scores, y, method: str = 'label-bin', histogram_method: str = 'khiops', max_bins: int = 0, multi_class_method: str = 'top-label', histogram: Histogram | None = None)¶

Estimates the expected calibration error (ECE) via binning

The default binning method “khiops” finds a regularized histogram for which the distribution of the target is as pure as possible in each bin.

Parameters:

y_scores: array-like of shape (n_samples,) or (n_samples, 1) or (n_samples, 2)

Scores/probabilities. If it is a 2-D array the column indexed as 1 will be used for the estimation.

y: array-like of shape (n_samples,) or (n_samples, n_classes)

Target values (class labels).

method: {“label-bin”, “bin”}, default=“label-bin”

ECE estimation method. See the Notes below for details.

multi_class_method: {“top-label”, “classwise”}, default=“top-label”

Multi-class ECE estimation method:

“top-label”: Estimates the ECE for the confidence scores (i.e., the score of the predicted class).
“classwise”: Estimates the ECE for each class in a one-vs-rest fashion and then averages the results.

histogram_method: {“khiops”, “eq-freq”, “eq-width”}, default=“khiops”

Histogram method:

“khiops”: A non-parametric regularized histogram method. It finds the best histogram such that the distribution of the target is constant in each bin.
“eq-freq”: All bins have the same number of elements. If many instances share the same value, the algorithm puts that value in its own bin, which will be larger than the other ones.
“eq-width”: All bins have the same width.

If the method is set to “eq-freq” or “eq-width”, then ‘y’ is not used to build the histogram.

max_bins: int, default=0

The maximum number of bins to be created. The algorithms usually create this number of bins, but they may create fewer. The default value 0 means:

For “khiops”: that there is no limit to the number of intervals.
For “eq-freq” or “eq-width”: that 10 is the maximum number of intervals.

histogram: Histogram, optional

A ready-made histogram. If set then it is used for the ECE computation and the parameters histogram_method and max_bins are ignored.
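To make the two multi_class_method options concrete, the following sketch shows how a multi-class problem can be reduced to binary ones before a binary ECE estimator is applied. It is an illustration under assumptions, not the library's code: the function names are hypothetical, scores are assumed to be given as one row of per-class scores per sample, and labels are assumed to be integer class indexes.

```python
def top_label_problem(score_rows, labels):
    """Reduce to one binary problem: confidence vs. whether the prediction is right."""
    confidences, hits = [], []
    for row, label in zip(score_rows, labels):
        # The predicted class is the one with the highest score.
        predicted = max(range(len(row)), key=row.__getitem__)
        confidences.append(row[predicted])
        hits.append(int(predicted == label))
    return confidences, hits


def classwise_problems(score_rows, labels):
    """Reduce to one binary one-vs-rest problem per class."""
    n_classes = len(score_rows[0])
    return [
        ([row[k] for row in score_rows], [int(label == k) for label in labels])
        for k in range(n_classes)
    ]


# Made-up scores for three samples and three classes.
score_rows = [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6]]
labels = [0, 2, 2]

top = top_label_problem(score_rows, labels)
per_class = classwise_problems(score_rows, labels)
```

In the “top-label” case a single binary ECE is computed on the returned pair, whereas in the “classwise” case a binary ECE is computed for each pair and the values are averaged.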

Notes

We present the formulas for the two binary ECE estimators used in this function, as described in [RCSM22]. Both approximate the theoretical calibration error:

\[\mathbb{E}\left[ \left| \mathbb{P}\left[ Y = 1 | S \right] - S \right| \right]\]

where the pair \((S, Y)\) is a random vector taking values in \([0, 1] \times \lbrace 0, 1 \rbrace\). The random variable \(S\) represents the (normalized) scores coming from a classifier, and \(Y\) the target.

Let \(\lbrace B_i \rbrace_{i=1}^I\) be a binning of \([0,1]\) and \(\lbrace (s_n, y_n) \rbrace_{n=1}^N\) a sample following the distribution of \((S, Y)\).

The label-bin ECE estimation is given by:

\[\text{ECE}_{\text{lb}} = \frac{1}{N} \sum_{n=1}^N \sum_{i = 1}^I \mathbb{1}_{s_n \in B_i} \left| s_n - \bar{y_i} \right|\]

where \(\bar{y_i}\) is the sample conditional expectation of \(Y\) in the \(i\)-th bin and \(|B_i|\) denotes the number of sample scores falling in \(B_i\):

\[\bar{y_i} = \frac{\sum_{n=1}^N \mathbb{1}_{s_n \in B_i} \cdot y_n}{|B_i|}.\]

On the other hand, the bin ECE estimation is given by:

\[\text{ECE}_{\text{bin}} = \sum_{i = 1}^I \frac{\left| B_i \right| }{N} \left| \bar{s_i} - \bar{y_i} \right|\]

where \(\bar{y_i}\) is as above, and \(\bar{s_i}\) is the sample conditional expectation of \(S\) in the \(i\)-th bin:

\[\bar{s_i} = \frac{\sum_{n=1}^N \mathbb{1}_{s_n \in B_i} \cdot s_n}{|B_i|}.\]

The bin estimator is a lower bound for the real ECE for any binning used. Moreover, for a given binning, it is a lower bound of the label-bin estimator.

The bin estimator is the most common in the literature, where it is usually called the plugin estimator or even presented as the definition of the ECE. Despite this fact, we choose the label-bin estimator as the default method because it is exact when the conditional distribution \(\mathbb{P}\left[ Y = 1 | S = s \right]\) is piecewise constant in \(s\). These piecewise constant models are exactly the hypothesis space of any histogram estimation of the ECE.
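The two formulas above can be checked numerically with a small pure-Python sketch. It is not the library's implementation: the function names, the fixed breakpoints, the toy data, and the bin boundary convention (bins closed on the left, out-of-range scores clamped) are assumptions chosen for the illustration.

```python
from bisect import bisect_right


def bin_index(breakpoints, score):
    """Index of the bin containing `score` (assumed convention, see above)."""
    i = bisect_right(breakpoints, score) - 1
    return min(max(i, 0), len(breakpoints) - 2)


def ece_estimates(scores, targets, breakpoints):
    """Return the (label-bin, bin) ECE estimates for a given binning."""
    n_bins = len(breakpoints) - 1
    members = [[] for _ in range(n_bins)]
    for s, y in zip(scores, targets):
        members[bin_index(breakpoints, s)].append((s, y))

    n = len(scores)
    ece_lb = 0.0
    ece_bin = 0.0
    for bin_members in members:
        if not bin_members:
            continue  # empty bins contribute nothing to either sum
        size = len(bin_members)
        y_bar = sum(y for _, y in bin_members) / size
        s_bar = sum(s for s, _ in bin_members) / size
        # Label-bin: average distance of each score to its bin's target rate.
        ece_lb += sum(abs(s - y_bar) for s, _ in bin_members) / n
        # Bin: weighted distance between the bin's mean score and target rate.
        ece_bin += size / n * abs(s_bar - y_bar)
    return ece_lb, ece_bin


# Toy sample with two bins, [0, 0.5) and [0.5, 1].
scores = [0.1, 0.2, 0.4, 0.6, 0.7, 0.9]
targets = [0, 0, 1, 0, 1, 1]
ece_lb, ece_bin = ece_estimates(scores, targets, [0.0, 0.5, 1.0])
```

On this toy sample the bin estimate is smaller than the label-bin estimate, in line with the lower-bound property stated above, which follows from the triangle inequality applied within each bin.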

khalib.build_reliability_diagram(y_scores, y, dirac_threshold=1e-06, log_plot_threshold=3.0, min_density_bar_width=0.0025)¶

Builds a reliability diagram with the target score distribution below

To build the diagram, this function uses Khiops supervised histograms. To build the score distribution, it uses Khiops unsupervised histograms. Using the latter, it implements a heuristic to detect when the scores are distributed as a sum of Diracs and changes the visualization accordingly.

Parameters:

y_scores (array-like of shape (n_samples,) or (n_samples, 1) or (n_samples, 2)): Scores/probabilities. If it is a 2-D array the column indexed as 1 will be used for the estimation.
y (array-like of shape (n_samples,) or (n_samples, n_classes)): Target values (class labels).
dirac_threshold (float, default=1.0e-06): If a bin in the scores’ unsupervised histogram has a width lower than ‘dirac_threshold’ then it is considered a Dirac mass.
log_plot_threshold (float, default=3.0): Density plot only: If the log-difference between the maximal and minimal positive density values is larger than ‘log_plot_threshold’ then the density plot uses a log scale in the y-axis.
min_density_bar_width (float, default=2.5e-03): Density plot only: If a bin of the scores’ unsupervised histogram has a width lower than ‘min_density_bar_width’ then it is plotted as having a width of ‘min_density_bar_width’.

Returns:

tuple

A 2-tuple containing:

A matplotlib.figure.Figure.
A dict containing two matplotlib.axes.Axes with keys “reliability diagram” and “score_distribution”.

class khalib.Histogram(breakpoints: list[float], freqs: list[int], target_freqs: list[tuple] = <factory>, classes: list = <factory>)¶

Bases: object

A histogram with optional target variable statistics

Note

To obtain instances from real data, prefer the factory method from_data rather than the base constructor.

Attributes:

breakpoints (list[float]): The breakpoints defining the histogram in increasing order.
freqs (list[int]): The frequencies of each bin.
densities (list[float]): The densities of each bin.
classes (list): The class labels.
target_freqs (list[tuple], optional): The target frequencies in each bin. Each tuple has the same size as classes.
target_probas (list[tuple], optional): The target conditional probabilities in each bin. Each tuple has the same size as classes.

property n_bins: int¶: Number of histogram bins.

property bins: list[tuple]¶: List of histogram bins.

property classes_type: type | None¶: Type of the target classes (only for histograms built with y).

classmethod from_data(x, y=None, method: str = 'khiops', max_bins: int = 0, use_finest: bool = False) → Histogram¶

Computes a histogram of a 1D vector via Khiops

Parameters:

x: array-like of shape (n_samples,) or (n_samples, 1)

Input scores.

y: array-like of shape (n_samples,) or (n_samples, 1), optional

Target values.

method: {“khiops”, “eq-freq”, “eq-width”}, default=“khiops”

Histogram method:

“khiops”: A non-parametric regularized histogram method.
“eq-freq”: All bins have the same number of elements. If many instances share the same value, the algorithm puts that value in its own bin, which will be larger than the other ones.
“eq-width”: All bins have the same width.

If the method is set to “eq-freq” or “eq-width”, then ‘y’ is ignored.

max_bins: int, default=0

The maximum number of bins to be created. The algorithms usually create this number of bins, but they may create fewer. The default value 0 means:

For “khiops”: that there is no limit to the number of intervals.
For “eq-freq” or “eq-width”: that 10 is the maximum number of intervals.

use_finest: bool, default=False

Unsupervised ‘khiops’ histogram only: If True it builds the finest histogram instead of the most interpretable.

Returns:

Histogram: The histogram object containing the bin limits and frequencies.
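The “eq-width” and “eq-freq” methods can be illustrated with a short pure-Python sketch of how breakpoints might be derived. This is a simplified stand-in, not what Khiops does internally: the function names are hypothetical, the handling of ties is an assumption, and the default of 10 bins simply mirrors the documented behavior for max_bins=0.

```python
def eq_width_breakpoints(values, n_bins=10):
    """Breakpoints of `n_bins` bins of equal width covering the data range."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    # Use `hi` itself as the last edge to avoid floating-point drift.
    return [lo + i * step for i in range(n_bins)] + [hi]


def eq_freq_breakpoints(values, n_bins=10):
    """Breakpoints of at most `n_bins` bins with roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    edges = [ordered[0]]
    for i in range(1, n_bins):
        candidate = ordered[(i * n) // n_bins]
        # Ties can produce repeated edges; skipping them merges the bins,
        # so a heavily repeated value ends up in one larger bin.
        if candidate > edges[-1]:
            edges.append(candidate)
    if ordered[-1] > edges[-1]:
        edges.append(ordered[-1])
    return edges


width_edges = eq_width_breakpoints([0.0, 1.0, 0.25], n_bins=4)
freq_edges = eq_freq_breakpoints(list(range(10)), n_bins=5)
tied_edges = eq_freq_breakpoints([0, 0, 0, 0, 0, 0, 1, 2, 3, 4], n_bins=5)
```

In the last call the value 0 appears six times, so fewer than five bins are produced and the first bin holds more elements than the others, which matches the behavior described for “eq-freq” and for max_bins.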

classmethod from_data_and_breakpoints(x, breakpoints: list[float], y=None) → Histogram¶

Builds a histogram from a list of breakpoints and data

Parameters:

x (array-like of shape (n_samples,) or (n_samples, 1)): Vector with the values to discretize for the histogram.
breakpoints (list[float]): A sorted list of floats defining the bin edges.
y (array-like of shape (n_samples,) or (n_samples, 1), optional): Target values associated with each element in ‘x’.

find(value: float) → int¶

Returns the histogram bin index for a value

Parameters:

value (float): The value to look up in the histogram bins.

Returns:

int: The index of the bin containing the value.

vfind(values)¶

Returns the histogram bin indexes for a value sequence

Parameters:

values (array-like): The values to look up in the histogram bins.

Returns:

array-like: The indexes of the bins containing the values.
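A bin lookup of this kind is typically a binary search over the sorted breakpoints. The sketch below uses the standard bisect module with hypothetical stand-alone functions; which bin owns a value that coincides with a breakpoint, and how out-of-range values are handled, are assumptions that may differ from the actual find and vfind methods.

```python
from bisect import bisect_right


def find_bin(breakpoints, value):
    """Index of the bin containing `value` for sorted edges b_0 < ... < b_K.

    Assumption: bins are closed on the left, the last bin also includes its
    right edge, and out-of-range values are clamped to the nearest bin.
    """
    i = bisect_right(breakpoints, value) - 1
    return min(max(i, 0), len(breakpoints) - 2)


def find_bins(breakpoints, values):
    """Vectorized counterpart: one bin index per value."""
    return [find_bin(breakpoints, v) for v in values]


edges = [0.0, 0.25, 0.5, 1.0]
indexes = find_bins(edges, [0.1, 0.25, 0.9, 1.0, -3.0])
```

Each lookup costs a logarithmic number of comparisons in the number of bins, which keeps the cost low even for fine histograms.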

khalib.calibrate_binary(y_scores, histogram: Histogram, only_positive: bool = False)¶

Calibrates a binary score with a histogram

This function may be used to calibrate binary classification scores without the use of the KhalibClassifier estimator.

Parameters:

y_scores (array-like of shape (n_samples,) or (n_samples, 1)): Scores for the positive class.
histogram (Histogram): A histogram with target statistics (i.e., Histogram.target_freqs must be non-empty).
only_positive (bool, default=False): If True, it returns only a 1-D array with the probabilities of the positive class. Otherwise, it returns a 2-D array with the probabilities for both classes.
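The idea behind this function can be sketched without the library: each score is mapped to its bin, and the calibrated probability is the proportion of positives observed in that bin. The code below is a hypothetical stand-in, not the actual implementation. It assumes that each entry of the target frequencies is a (negative count, positive count) pair, that every bin is non-empty, and that bins are closed on the left.

```python
from bisect import bisect_right


def calibrate_binary_sketch(scores, breakpoints, target_freqs, only_positive=False):
    """Map each score to the positive-class proportion of its bin."""
    calibrated = []
    last_bin = len(breakpoints) - 2
    for s in scores:
        # Locate the bin, clamping out-of-range scores (assumed convention).
        i = min(max(bisect_right(breakpoints, s) - 1, 0), last_bin)
        negatives, positives = target_freqs[i]  # assumed tuple layout
        p = positives / (negatives + positives)
        calibrated.append(p if only_positive else (1.0 - p, p))
    return calibrated


# Made-up histogram with two bins and their per-class counts.
breakpoints = [0.0, 0.5, 1.0]
target_freqs = [(8, 2), (3, 7)]

positive_only = calibrate_binary_sketch(
    [0.2, 0.9], breakpoints, target_freqs, only_positive=True
)
both_classes = calibrate_binary_sketch([0.2, 0.9], breakpoints, target_freqs)
```

With only_positive=True the result has one value per score, while the default returns a pair per score whose entries sum to one.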

References

[RCSM22]

Roelofs, R., Cain, N., Shlens, J. & Mozer, M. C. (2022). Mitigating Bias in Calibration Error Estimation. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:4036-4054. Available from https://proceedings.mlr.press/v151/roelofs22a.html.