{ "cells": [ { "cell_type": "markdown", "id": "721d9059", "metadata": {}, "source": [ " # Quickstart\n", "\n", "\n", " {download}`⬇️ Download this Notebook <./quickstart.ipynb>`\n", "\n", " ## KhalibClassifier Scikit-Learn Estimator\n", "\n", " We first create our train, calibration and test datasets. We keep 45k samples for testing\n", " to ensure reliable error estimates. The remaining 5k are split evenly into train and calibration." ] }, { "cell_type": "code", "execution_count": null, "id": "1c5a14fe", "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import make_classification\n", "from sklearn.model_selection import train_test_split\n", "\n", "X, y = make_classification(\n", " n_samples=50_000, n_features=20, n_informative=2, n_redundant=10, random_state=42\n", ")\n", "X_train, X_not_train, y_train, y_not_train = train_test_split(\n", " X, y, train_size=2500, random_state=42\n", ")\n", "X_calib, X_test, y_calib, y_test = train_test_split(\n", " X_not_train, y_not_train, train_size=2500, random_state=42\n", ")" ] }, { "cell_type": "markdown", "id": "11e47c79", "metadata": {}, "source": [ "We now train a `GaussianNB` classifier. This kind of model is usually uncalibrated\n", "because real data never fully satisfy its hypotheses. 
We also estimate its expected calibration\n", "error (ECE):" ] }, { "cell_type": "code", "execution_count": null, "id": "24800655", "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "from sklearn.naive_bayes import GaussianNB\n", "\n", "import khalib\n", "\n", "# Compute the positive scores with a Gaussian Naive Bayes model\n", "gnb = GaussianNB()\n", "gnb.fit(X_train, y_train)\n", "y_scores_test = gnb.predict_proba(X_test)[:, 1]\n", "\n", "# Compute and display the ECE\n", "ece_test = khalib.calibration_error(y_scores_test, y_test)\n", "print(\"RAW GNB ECE:\", ece_test)" ] }, { "cell_type": "markdown", "id": "fefe95a7", "metadata": {}, "source": [ "We can also plot the reliability diagram using the `build_reliability_diagram`\n", "function:" ] }, { "cell_type": "code", "execution_count": null, "id": "3ed2da8f", "metadata": {}, "outputs": [], "source": [ "%config InlineBackend.figure_formats = ['svg']\n", "_ = khalib.build_reliability_diagram(y_scores_test, y_test)" ] }, { "cell_type": "markdown", "id": "faf3526f", "metadata": {}, "source": [ "We now calibrate the model with a `KhalibClassifier` object, which takes the uncalibrated\n", "model as a parameter. We then `fit` it on the `calib` split." 
] }, { "cell_type": "code", "execution_count": null, "id": "c4ab538f", "metadata": {}, "outputs": [], "source": [ "# Train the calibrated classifier and obtain the calibrated scores\n", "calib_gnb = khalib.KhalibClassifier(gnb)\n", "calib_gnb.fit(X_calib, y_calib)\n", "y_calib_scores_test = calib_gnb.predict_proba(X_test)[:, 1]\n", "\n", "# Compute the ECE\n", "calib_ece_test = khalib.calibration_error(y_calib_scores_test, y_test)\n", "print(\"CALIB ECE:\", calib_ece_test)\n", "print(\"Reduction:\", (ece_test - calib_ece_test) / ece_test)" ] }, { "cell_type": "markdown", "id": "081da17b", "metadata": {}, "source": [ "We observe that `khalib` reduced the ECE by ~90%. We now plot the reliability diagram\n", "for the calibrated scores. The `build_reliability_diagram` function uses a heuristic to detect when\n", "the scores are distributed as Dirac deltas and adapts the visualization accordingly:" ] }, { "cell_type": "code", "execution_count": null, "id": "1bcade36", "metadata": {}, "outputs": [], "source": [ "_ = khalib.build_reliability_diagram(y_calib_scores_test, y_test)" ] }, { "cell_type": "markdown", "id": "b136103a", "metadata": {}, "source": [ "## calibrate_binary function + Histogram class\n", "\n", "We can achieve the same result \"manually\" by using the `calibrate_binary` function,\n", "which calibrates the scores with a `Histogram` object." 
] }, { "cell_type": "code", "execution_count": null, "id": "75463478", "metadata": {}, "outputs": [], "source": [ "# Obtain the scores on the calib split and build a supervised histogram from them\n", "y_scores_calib = gnb.predict_proba(X_calib)[:, 1]\n", "hist = khalib.Histogram.from_data(y_scores_calib, y=y_calib)\n", "\n", "# Calibrate the scores of the test split\n", "calib_hist_y_test_scores = khalib.calibrate_binary(\n", " y_scores_test, hist, only_positive=True\n", ")\n", "\n", "# Print the error and its relative reduction, then plot the reliability diagram\n", "calib_hist_ece_test = khalib.calibration_error(calib_hist_y_test_scores, y=y_test)\n", "print(\"CALIB HIST ECE:\", calib_hist_ece_test)\n", "print(\"Reduction :\", (ece_test - calib_hist_ece_test) / ece_test)\n", "_ = khalib.build_reliability_diagram(calib_hist_y_test_scores, y_test)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 5 }