API Reference
- class khalib.KhalibClassifier(estimator=None, train_size=0.75, random_state=None)
Bases: ClassifierMixin, MetaEstimatorMixin, BaseEstimator
Classifier calibration via Khiops
This estimator uses the Khiops regularized histogram construction method (MODL) to calibrate the probabilities/scores coming from a classifier.
For binary classifiers, it fits a Khiops histogram for the scores for the positive class. Then, for a given score, it estimates its calibrated probability as the positive class probability of the bin in which the score falls.
For multi-class classifiers, it fits a Khiops histogram for each class in a one-vs-rest fashion. Then, given a score vector, it estimates its calibrated probabilities by applying the binary method described above to each one-vs-rest problem and then normalizing the results so that they sum to one.
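The one-vs-rest procedure above can be sketched in plain Python. This is only an illustration under stated assumptions, not khalib's implementation: the names `bin_index` and `calibrate_one_vs_rest`, the `(breakpoints, positive_rates)` representation of a histogram, the right-open bin convention, and the uniform fallback for an all-zero vector are all hypothetical.

```python
from bisect import bisect_right


def bin_index(breakpoints, score):
    """Index of the bin containing `score`.

    Assumption: bins are right-open, except the last one, and scores
    outside the range are clamped to the first or last bin.
    """
    idx = bisect_right(breakpoints, score) - 1
    return min(max(idx, 0), len(breakpoints) - 2)


def calibrate_one_vs_rest(scores, histograms):
    """Calibrates one score vector (one score per class).

    `histograms` is a hypothetical list with one (breakpoints, positive_rates)
    pair per class; positive_rates[i] is the fraction of positives in bin i.
    """
    # Binary method, applied independently to each one-vs-rest problem
    raw = [
        rates[bin_index(breakpoints, s)]
        for s, (breakpoints, rates) in zip(scores, histograms)
    ]
    total = sum(raw)
    if total == 0:
        # Assumption: fall back to a uniform distribution
        return [1 / len(raw)] * len(raw)
    # Normalize so that the calibrated probabilities sum to one
    return [p / total for p in raw]
```

In the binary case only the positive-class lookup is needed and no normalization takes place.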
This class closely emulates the CalibratedClassifierCV interface but without cross-validation, as the Khiops models are regularized and do not need it.
- Parameters:
- estimator: estimator instance, default=None
The classifier whose outputs need to be calibrated to provide more accurate predict_proba outputs. If not provided, it trains a KhiopsClassifier. If provided but not fitted, it is fitted during fit on the subset of the data selected by train_size.
- train_size: float or int, default=0.75
Same parameter as in train_test_split. Only used when estimator is not provided or not fitted.
- random_state: int, default=None
Same parameter as in train_test_split. Only used when estimator is not provided or not fitted.
- Attributes:
- fitted_estimator_
The underlying fitted classifier. It is the same instance as estimator if it was provided; otherwise it is an instance of KhiopsClassifier.
- histograms_: list of size 1 or n_classes
In the binary case: a list of size 1 with the Histogram instance for the positive class. In the multi-class case: a list of size n_classes containing the one-vs-rest Histogram for each class.
- fit(X, y)
Fits the calibrated model
- Parameters:
- X: array-like of shape (n_samples, n_features)
Training data.
- y: array-like of shape (n_samples,)
Target values.
- Returns:
- self
The calling estimator instance.
- predict_proba(X)
Estimates the calibrated classification probabilities
- Parameters:
- X: array-like of shape (n_samples, n_features)
Input data for which the probabilities are estimated.
- Returns:
- P: array-like of shape (n_samples, n_classes)
The estimated calibrated probabilities for each class.
- khalib.calibration_error(y_scores, y, method: str = 'label-bin', histogram_method: str = 'khiops', max_bins: int = 0, multi_class_method: str = 'top-label', histogram: Histogram | None = None)
Estimates the expected calibration error (ECE) via binning
The default binning method “khiops” finds a regularized histogram for which the distribution of the target is as pure as possible in each bin.
- Parameters:
- y_scores: array-like of shape (n_samples,) or (n_samples, 1) or (n_samples, 2)
Scores/probabilities. If it is a 2-D array, the column with index 1 is used for the estimation.
- y: array-like of shape (n_samples,) or (n_samples, n_classes)
Target values (class labels).
- method: {“label-bin”, “bin”}, default=“label-bin”
ECE estimation method. See the Notes for details.
- multi_class_method: {“top-label”, “classwise”}, default=“top-label”
Multi-class ECE estimation method:
“top-label”: Estimates the ECE for the confidence scores (i.e. the score of the predicted class).
“classwise”: Estimates the ECE for each class in a one-vs-rest fashion and then averages the results.
- histogram_method: {“khiops”, “eq-freq”, “eq-width”}, default=“khiops”
Histogram method:
“khiops”: A non-parametric regularized histogram method. It finds the best histogram such that the distribution of the target is constant in each bin.
“eq-freq”: All bins have the same number of elements. If too many instances share the same value, the algorithm puts that value in its own bin, which will be larger than the others.
“eq-width”: All bins have the same width.
If the method is set to “eq-freq” or “eq-width”, then ‘y’ is ignored.
- max_bins: int, default=0
The maximum number of bins to be created. The algorithms usually create this number of bins, but they may create fewer. The default value 0 means:
For “khiops”: there is no limit to the number of intervals.
For “eq-freq” or “eq-width”: the maximum number of intervals is 10.
- histogram: Histogram, optional
A ready-made histogram. If set, it is used for the ECE computation and the parameters histogram_method and max_bins are ignored.
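The two multi-class strategies can be read as reductions to binary calibration problems. The sketch below illustrates that reading only; it is not khalib's code. The helper names are hypothetical, and labels are assumed to be integer column indexes into the score rows.

```python
def top_label_problem(score_rows, labels):
    """Reduces multi-class scores to (confidence, is_correct) lists.

    Assumption: `labels` holds integer class indexes matching the columns.
    """
    confidences, hits = [], []
    for row, label in zip(score_rows, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        confidences.append(row[predicted])  # score of the predicted class
        hits.append(int(predicted == label))  # 1 if the prediction is right
    return confidences, hits


def one_vs_rest_problems(score_rows, labels):
    """Builds one binary (scores, targets) problem per class."""
    n_classes = len(score_rows[0])
    return [
        ([row[k] for row in score_rows], [int(label == k) for label in labels])
        for k in range(n_classes)
    ]
```

Under this reading, “top-label” feeds the single reduced problem to a binary ECE estimator, while “classwise” estimates the ECE of each one-vs-rest problem and averages the results.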
Notes
We present the formulas for the two binary ECE estimators used in this function, as described in [RCSM22]. Both approximate the theoretical calibration error:
\[\mathbb{E}\left[ \left| \mathbb{P}\left[ Y = 1 | S \right] - S \right| \right]\]
where the pair \((S, Y)\) is a random vector taking values in \([0, 1] \times \lbrace 0, 1 \rbrace\). The random variable \(S\) represents the (normalized) scores coming from a classifier, and \(Y\) the target.
Let \(\lbrace B_i \rbrace_{i=1}^I\) be a binning of \([0,1]\) and \(\lbrace (s_n, y_n) \rbrace_{n=1}^N\) a sample following the distribution of \((S, Y)\). Below, \(|B_i|\) denotes the number of sample scores \(s_n\) falling in \(B_i\).
The label-bin ECE estimation is given by:
\[\text{ECE}_{\text{lb}} = \frac{1}{N} \sum_{n=1}^N \sum_{i = 1}^I \mathbb{1}_{s_n \in B_i} \left| s_n - \bar{y_i} \right|\]
where \(\bar{y_i}\) is the sample conditional expectation of \(Y\) in the \(i\)-th bin:
\[\bar{y_i} = \frac{\sum_{n=1}^N \mathbb{1}_{s_n \in B_i} \cdot y_n}{|B_i|}.\]
On the other hand, the bin ECE estimation is given by:
\[\text{ECE}_{\text{bin}} = \sum_{i = 1}^I \frac{\left| B_i \right| }{N} \left| \bar{s_i} - \bar{y_i} \right|\]
where \(\bar{y_i}\) is as above, and \(\bar{s_i}\) is the sample conditional expectation of \(S\) in the \(i\)-th bin:
\[\bar{s_i} = \frac{\sum_{n=1}^N \mathbb{1}_{s_n \in B_i} \cdot s_n}{|B_i|}.\]
For any binning, the bin estimator is a lower bound of the true ECE. Moreover, for a given binning, it is a lower bound of the label-bin estimator.
The bin estimator is the most common in the literature, where it is usually called the plugin estimator or even presented as the definition of the ECE. Despite this, we choose the label-bin estimator as the default method because it is exact when the conditional distribution \(\mathbb{P}\left[ Y = 1 | S = s \right]\) is piecewise constant in \(s\). These piecewise constant models are exactly the hypothesis space of any histogram estimation of the ECE.
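As a sanity check of the two formulas, the following plain-Python sketch computes both estimators for a fixed binning. It is an illustration, not the code of `calibration_error`: the function name `ece_estimates` and the bin boundary convention (right-open bins, last bin closed) are assumptions.

```python
from bisect import bisect_right


def ece_estimates(scores, targets, breakpoints):
    """Returns (label-bin ECE, bin ECE) for binary targets in {0, 1}."""
    n_bins = len(breakpoints) - 1
    members = [[] for _ in range(n_bins)]
    for n, s in enumerate(scores):
        # Assumption: right-open bins, with the last bin closed
        i = min(max(bisect_right(breakpoints, s) - 1, 0), n_bins - 1)
        members[i].append(n)

    n_samples = len(scores)
    ece_lb = ece_bin = 0.0
    for idx in members:
        if not idx:
            continue  # empty bins contribute nothing to either sum
        y_bar = sum(targets[n] for n in idx) / len(idx)
        s_bar = sum(scores[n] for n in idx) / len(idx)
        # Label-bin: average of |s_n - y_bar| over the samples
        ece_lb += sum(abs(scores[n] - y_bar) for n in idx) / n_samples
        # Bin: bin weight times |s_bar - y_bar|
        ece_bin += len(idx) / n_samples * abs(s_bar - y_bar)
    return ece_lb, ece_bin
```

With a single bin over \([0, 1]\), scores `[0.2, 0.8]` and targets `[0, 1]`, the bin estimate vanishes because the score and target averages coincide, while the label-bin estimate still measures the distance of each score to the bin's target rate. This illustrates the lower-bound relation stated above.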
- khalib.build_reliability_diagram(y_scores, y, dirac_threshold=1e-06, log_plot_threshold=3.0, min_density_bar_width=0.0025)
Builds a reliability diagram with the target score distribution below
To build the diagram, this function uses Khiops supervised histograms. To build the score distribution, it uses Khiops unsupervised histograms. Using the latter, it implements a heuristic to detect when the scores are distributed as a sum of Diracs and changes the visualization accordingly.
- Parameters:
- y_scores: array-like of shape (n_samples,) or (n_samples, 1) or (n_samples, 2)
Scores/probabilities. If it is a 2-D array, the column with index 1 is used for the estimation.
- y: array-like of shape (n_samples,) or (n_samples, n_classes)
Target values (class labels).
- dirac_threshold: float, default=1.0e-06
If the width of a bin in the scores’ unsupervised histogram is lower than ‘dirac_threshold’, then the bin is considered a Dirac mass.
- log_plot_threshold: float, default=3.0
Density plot only: If the log-difference between the maximal and minimal positive density values is larger than ‘log_plot_threshold’, then the density plot uses a log scale on the y-axis.
- min_density_bar_width: float, default=2.5e-03
Density plot only: If a bin of the scores’ unsupervised histogram has a width lower than ‘min_density_bar_width’ then it is plotted as having a width of ‘min_density_bar_width’.
- Returns:
- tuple
A 2-tuple containing:
A dict containing two matplotlib.axes.Axes, with keys “reliability diagram” and “score_distribution”.
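The two density plot parameters can be illustrated with a small sketch. It shows one plausible reading of their descriptions rather than the function's actual logic: the helper names are hypothetical, and the use of a base-10 logarithm is an assumption, since the base is not stated here.

```python
import math


def use_log_scale(densities, log_plot_threshold=3.0):
    """Decides whether a log-scale y-axis is warranted.

    Assumption: the log-difference is taken in base 10.
    """
    positive = [d for d in densities if d > 0]
    if not positive:
        return False
    spread = math.log10(max(positive)) - math.log10(min(positive))
    return spread > log_plot_threshold


def plotted_widths(breakpoints, min_density_bar_width=0.0025):
    """Bar widths used for drawing: narrow bins are widened to the minimum."""
    return [
        max(right - left, min_density_bar_width)
        for left, right in zip(breakpoints, breakpoints[1:])
    ]
```

Under the base-10 assumption, the default threshold of 3.0 switches to a log scale when the largest positive density is more than 1000 times the smallest one.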
- class khalib.Histogram(breakpoints: list[float], freqs: list[int], target_freqs: list[tuple] = <factory>, classes: list = <factory>)
Bases: object
A histogram with optional target variable statistics
Note
To obtain instances from real data, prefer the factory method from_data rather than the base constructor.
- Attributes:
- breakpoints: list[float]
The breakpoints defining the histogram in increasing order.
- freqs: list of float
The frequencies of each bin.
- densities: list[float]
The densities of each bin.
- classes: list
The class labels.
- target_freqs: list[tuple], optional
The target frequencies in each bin. Each tuple has the same size as classes.
- target_probas: list[tuple], optional
The target conditional probabilities in each bin. Each tuple has the same size as classes.
- classmethod from_data(x, y=None, method: str = 'khiops', max_bins: int = 0, use_finest: bool = False) → Histogram
Computes a histogram of a 1D vector via Khiops
- Parameters:
- x: array-like of shape (n_samples,) or (n_samples, 1)
Input scores.
- y: array-like of shape (n_samples,) or (n_samples, 1), optional
Target values.
- method: {“khiops”, “eq-freq”, “eq-width”}, default=“khiops”
Histogram method:
“khiops”: A non-parametric regularized histogram method.
“eq-freq”: All bins have the same number of elements. If too many instances share the same value, the algorithm puts that value in its own bin, which will be larger than the others.
“eq-width”: All bins have the same width.
If the method is set to “eq-freq” or “eq-width”, then ‘y’ is ignored.
- max_bins: int, default=0
The maximum number of bins to be created. The algorithms usually create this number of bins, but they may create fewer. The default value 0 means:
For “khiops”: there is no limit to the number of intervals.
For “eq-freq” or “eq-width”: the maximum number of intervals is 10.
- use_finest: bool, default=False
Unsupervised ‘khiops’ histogram only: If True, it builds the finest histogram instead of the most interpretable.
- Returns:
- Histogram
The histogram object containing the bin limits and frequencies.
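To make the “eq-width” and “eq-freq” descriptions concrete, here is a minimal plain-Python sketch of both binning schemes. It is not the algorithm behind `from_data`: the function names and the tie-handling rule are assumptions chosen to show why fewer bins than requested may be created.

```python
def equal_width_breakpoints(values, n_bins=10):
    """Breakpoints of n_bins bins of identical width over [min, max].

    Assumption: the values are not all identical (otherwise all bins collapse).
    """
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins)] + [hi]


def equal_freq_breakpoints(values, n_bins=10):
    """Breakpoints aiming at the same number of elements per bin."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = [ordered[0]]
    for i in range(1, n_bins):
        candidate = ordered[(i * n) // n_bins]
        # A heavily repeated value collapses several cuts into one,
        # so fewer (and uneven) bins may be produced
        if candidate > cuts[-1]:
            cuts.append(candidate)
    if ordered[-1] > cuts[-1]:
        cuts.append(ordered[-1])
    return cuts
```

For example, the values `[0, 0, 0, 0, 0, 0, 1, 2]` with `n_bins=4` yield only two bins in this sketch, and the bin holding the repeated value is larger than the other one.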
- classmethod from_data_and_breakpoints(x, breakpoints: list[float], y=None) → Histogram
Builds a histogram from a list of breakpoints and data
- Parameters:
- x: array-like of shape (n_samples,) or (n_samples, 1)
Vector with the values to discretize for the histogram.
- breakpoints: list[float]
A sorted list of floats defining the bin edges.
- y: array-like of shape (n_samples,) or (n_samples, 1), optional
Target values associated with each element in ‘x’.
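The statistics a histogram carries can be derived from data and fixed breakpoints as in the sketch below. It is an illustration and not the body of `from_data_and_breakpoints`: the function name `bin_counts`, the bin boundary convention, and the density definition (frequency divided by sample size times bin width) are assumptions.

```python
from bisect import bisect_right


def bin_counts(x, breakpoints, y=None, classes=(0, 1)):
    """Returns (freqs, densities, target_freqs) for fixed breakpoints."""
    n_bins = len(breakpoints) - 1
    freqs = [0] * n_bins
    per_class = [[0] * len(classes) for _ in range(n_bins)]
    for n, value in enumerate(x):
        # Assumption: right-open bins, with the last bin closed
        i = min(max(bisect_right(breakpoints, value) - 1, 0), n_bins - 1)
        freqs[i] += 1
        if y is not None:
            per_class[i][classes.index(y[n])] += 1

    total = len(x)
    # Assumption: density = frequency / (sample size * bin width)
    densities = [
        f / (total * (breakpoints[i + 1] - breakpoints[i]))
        for i, f in enumerate(freqs)
    ]
    # One tuple per bin, with the same size as `classes`
    target_freqs = [tuple(counts) for counts in per_class] if y is not None else []
    return freqs, densities, target_freqs
```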
- find(value: float) → int
Returns the histogram bin index for a value
- Parameters:
- value: float
The value to look up in the histogram bins.
- Returns:
- int
The index of the bin containing the value.
- vfind(values)
Returns the histogram bin indexes for a value sequence
- Parameters:
- values: array-like
The values to look up in the histogram bins.
- Returns:
- array-like
The indexes of the bins containing the values.
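A bin lookup of this kind can be written with a binary search over the sorted breakpoints. The free functions below are a hypothetical sketch, not the `Histogram` methods themselves; in particular, how values on a bin edge or outside the histogram range are treated is an assumption.

```python
from bisect import bisect_right


def find(breakpoints, value):
    """Index of the bin containing `value`.

    Assumptions: bins are right-open except the last one, and
    out-of-range values are clamped to the first or last bin.
    """
    last_bin = len(breakpoints) - 2
    return min(max(bisect_right(breakpoints, value) - 1, 0), last_bin)


def vfind(breakpoints, values):
    """Vectorized lookup: one bin index per value."""
    return [find(breakpoints, v) for v in values]
```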
- khalib.calibrate_binary(y_scores, histogram: Histogram, only_positive: bool = False)
Calibrates a binary score with a histogram
This function may be used to calibrate binary classification scores without the use of the KhalibClassifier estimator.
- Parameters:
- y_scores: array-like of shape (n_samples,) or (n_samples, 1)
Scores for the positive class.
- histogram: Histogram
A histogram with target statistics (i.e. Histogram.target_freqs must be non-empty).
- only_positive: bool, default=False
If True, it returns only a 1-D array with the probabilities of the positive class. Otherwise, it returns a 2-D array with the probabilities for both classes.
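The idea behind histogram-based binary calibration can be sketched as follows. This is a stand-alone illustration that does not call khalib: the function name, the plain-list inputs, the `(negatives, positives)` ordering of each target frequency tuple, and the requirement that every bin is non-empty are assumptions.

```python
from bisect import bisect_right


def calibrate_binary_sketch(y_scores, breakpoints, target_freqs, only_positive=False):
    """Maps each score to the positive rate of the bin it falls in.

    Assumptions: target_freqs[i] = (negatives, positives) counted in bin i,
    and every bin contains at least one sample.
    """
    n_bins = len(breakpoints) - 1
    calibrated = []
    for s in y_scores:
        i = min(max(bisect_right(breakpoints, s) - 1, 0), n_bins - 1)
        neg, pos = target_freqs[i]
        p = pos / (neg + pos)
        # Either the positive probability alone or the (negative, positive) pair
        calibrated.append(p if only_positive else (1 - p, p))
    return calibrated
```

With `only_positive=True` the result is a flat list of positive-class probabilities; otherwise each entry holds the probabilities of both classes, mirroring the 1-D and 2-D outputs described above.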
References
[RCSM22] Roelofs, R., Cain, N., Shlens, J. & Mozer, M.C. (2022). Mitigating Bias in Calibration Error Estimation. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:4036-4054. Available from https://proceedings.mlr.press/v151/roelofs22a.html.