{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2D Feature Effects\n",
    "Two-dimensional accumulated local effect (ALE) and partial dependence (PD) plots show how pairs of features jointly affect a model's predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "import skexplain\n",
    "import plotting_config\n",
    "from matplotlib.ticker import MaxNLocator"
   ],
   "outputs": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Load the training data and pre-fit models\n",
    "estimators = skexplain.load_models()\n",
    "X, y = skexplain.load_data()\n",
    "X = X.astype({'urban': 'category', 'rural': 'category'})\n",
    "\n",
    "explainer = skexplain.ExplainToolkit(estimators, X=X, y=y)\n",
    "\n",
    "explainer.set_plotting_config(\n",
    "    display_feature_names=plotting_config.display_feature_names,\n",
    "    display_units=plotting_config.display_units,\n",
    ")"
   ],
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2D ALE\n",
    "\n",
    "2D ALE measures the second-order effect of a feature pair: the joint contribution that remains after subtracting each feature's 1D effect. If the two features do not interact in the model, the 2D ALE is zero everywhere."
   ]
  },
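  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of why the 2D ALE vanishes without an interaction, the sketch below computes the second-order (mixed) finite difference of a function over a single grid cell, which is the kind of local quantity that 2D ALE accumulates. The helper `second_order_difference` and the toy functions `additive` and `interacting` are hypothetical names introduced here; this is not skexplain's implementation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def second_order_difference(f, x1_lo, x1_hi, x2_lo, x2_hi):\n",
    "    # Change in f along x1 at the upper x2 edge, minus the same change at the lower x2 edge\n",
    "    return (f(x1_hi, x2_hi) - f(x1_lo, x2_hi)) - (f(x1_hi, x2_lo) - f(x1_lo, x2_lo))\n",
    "\n",
    "\n",
    "# Toy functions (assumptions for illustration only)\n",
    "additive = lambda x1, x2: 2.0 * x1 + np.sin(x2)\n",
    "interacting = lambda x1, x2: 2.0 * x1 + np.sin(x2) + x1 * x2\n",
    "\n",
    "cell = dict(x1_lo=0.0, x1_hi=1.0, x2_lo=0.0, x2_hi=0.5)\n",
    "diff_additive = second_order_difference(additive, **cell)\n",
    "diff_interacting = second_order_difference(interacting, **cell)\n",
    "\n",
    "# The additive function gives (numerically) zero; only the x1 * x2 term survives otherwise\n",
    "print(diff_additive, diff_interacting)"
   ],
   "outputs": []
  },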
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "features = [('temp2m', 'sfc_temp'), ('dwpt2m', 'sfc_temp'), ('temp2m', 'dwpt2m')]\n",
    "\n",
    "ale_2d_ds = explainer.ale(\n",
    "    features=features,\n",
    "    n_bootstrap=1,\n",
    "    subsample=1.0,\n",
    "    n_jobs=len(features) * len(estimators),\n",
    "    n_bins=30,\n",
    ")"
   ],
   "outputs": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Default plot includes KDE contours and scatter overlays\n",
    "fig, axes = explainer.plot_ale(ale=ale_2d_ds)"
   ],
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Customizing 2D Plots\n",
    "\n",
    "The marginal distributions, scatter points, and KDE contours can be toggled on or off."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Recompute for a single feature pair for cleaner examples\n",
    "features_single = [('temp2m', 'sfc_temp')]\n",
    "\n",
    "ale_2d_single = explainer.ale(\n",
    "    features=features_single,\n",
    "    n_bootstrap=1,\n",
    "    subsample=1.0,\n",
    "    n_jobs=len(features_single) * len(estimators),\n",
    "    n_bins=30,\n",
    ")\n",
    "\n",
    "# Plot without KDE or scatter overlays\n",
    "fig, axes = explainer.plot_ale(\n",
    "    ale=ale_2d_single,\n",
    "    kde_curves=False,\n",
    "    scatter=False,\n",
    ")"
   ],
   "outputs": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Plot as smoothed filled contours instead of pixel-style image\n",
    "fig, axes = explainer.plot_ale(\n",
    "    ale=ale_2d_single,\n",
    "    kde_curves=False,\n",
    "    scatter=False,\n",
    "    contours=True,\n",
    ")"
   ],
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Custom Colorbar and Sizing\n",
    "\n",
    "You can control the colorbar, figure size, font size, and number of histogram bins."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "cbar_kwargs = {'extend': 'neither', 'ticks': MaxNLocator(3)}\n",
    "\n",
    "fig, axes = explainer.plot_ale(\n",
    "    ale=ale_2d_single,\n",
    "    kde_curves=False,\n",
    "    scatter=False,\n",
    "    figsize=(10, 5),\n",
    "    fontsize=10,\n",
    "    cbar_kwargs=cbar_kwargs,\n",
    "    bins=7,\n",
    ")"
   ],
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2D Partial Dependence\n",
    "\n",
    "2D PD shows the joint marginal effect of two features. Unlike 2D ALE, it does not subtract the 1D effects."
   ]
  },
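  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal brute-force sketch of 2D partial dependence, assuming a generic `predict` callable and a NumPy feature matrix: for each pair of grid values, both feature columns are overwritten in every row and the predictions are averaged. The helper `partial_dependence_2d` and the toy model are hypothetical and are not necessarily how skexplain computes PD internally. Note that the 1D effects stay in the resulting surface."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def partial_dependence_2d(predict, X, i, j, grid_i, grid_j):\n",
    "    # Average prediction with columns i and j forced to each pair of grid values\n",
    "    pd_surface = np.empty((len(grid_i), len(grid_j)))\n",
    "    for a, value_i in enumerate(grid_i):\n",
    "        for b, value_j in enumerate(grid_j):\n",
    "            X_mod = X.copy()\n",
    "            X_mod[:, i] = value_i\n",
    "            X_mod[:, j] = value_j\n",
    "            pd_surface[a, b] = predict(X_mod).mean()\n",
    "    return pd_surface\n",
    "\n",
    "\n",
    "# Toy data and model (assumptions for illustration only)\n",
    "rng = np.random.default_rng(0)\n",
    "X_toy = rng.normal(size=(200, 3))\n",
    "toy_predict = lambda X: X[:, 0] + X[:, 1] + X[:, 0] * X[:, 1] + 0.5 * X[:, 2]\n",
    "\n",
    "grid = np.linspace(-1.0, 1.0, 5)\n",
    "pd_toy = partial_dependence_2d(toy_predict, X_toy, 0, 1, grid, grid)\n",
    "print(pd_toy.shape)"
   ],
   "outputs": []
  },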
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "features_2d = [('temp2m', 'sfc_temp'), ('dwpt2m', 'sfc_temp'), ('temp2m', 'dwpt2m')]\n",
    "\n",
    "pd_2d_ds = explainer.pd(\n",
    "    features=features_2d,\n",
    "    n_bootstrap=1,\n",
    "    subsample=1000,\n",
    "    n_jobs=len(features_2d) * len(estimators),\n",
    "    n_bins=10,\n",
    ")\n",
    "\n",
    "fig, axes = explainer.plot_pd(pd=pd_2d_ds)"
   ],
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2D ALE vs PD\n",
    "\n",
    "Both 2D ALE and 2D PD can reveal feature interactions, but they differ in important ways:\n",
    "\n",
    "- **2D ALE** isolates the pure interaction effect by subtracting the 1D effects. It handles correlated features by averaging prediction differences only over observations that fall within each local bin, so the model is not evaluated at unrealistic feature combinations. If there is no interaction, the 2D ALE is zero everywhere.\n",
    "- **2D PD** shows the joint marginal effect, which includes both 1D effects and interactions. Because it evaluates the model at every grid combination regardless of whether such points occur in the data, it can produce misleading results when features are dependent.\n",
    "\n",
    "Use 2D ALE when you want to identify genuine feature interactions, especially with correlated features. Use 2D PD when features are independent and you want the full joint effect."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.8.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}