2D Feature Effects

Two-dimensional accumulated local effects (ALE) and partial dependence (PD) plots reveal how pairs of features jointly affect a model's predictions.

[ ]:
import skexplain
import plotting_config
from matplotlib.ticker import MaxNLocator
[ ]:
# Load the training data and pre-fit models
estimators = skexplain.load_models()
X, y = skexplain.load_data()
X = X.astype({'urban': 'category', 'rural': 'category'})

explainer = skexplain.ExplainToolkit(estimators, X=X, y=y)

explainer.set_plotting_config(
    display_feature_names=plotting_config.display_feature_names,
    display_units=plotting_config.display_units,
)

2D ALE

2D ALE measures the additional contribution from the combination of two features after their respective 1D (main) effects have been removed. If the two features do not interact, meaning the model depends on them in a purely additive way, the 2D ALE is zero everywhere.
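
The idea of removing the 1D effects can be illustrated with the second-order finite difference that 2D ALE accumulates over grid cells. The snippet below is a minimal NumPy sketch of that single building block, not skexplain's implementation; the toy functions `additive` and `interacting` and the helper `second_order_difference` are hypothetical names introduced only for illustration.

```python
import numpy as np

def second_order_difference(f, a_lo, a_hi, b_lo, b_hi):
    """Change in f across one grid cell that neither feature explains alone."""
    return f(a_hi, b_hi) - f(a_lo, b_hi) - f(a_hi, b_lo) + f(a_lo, b_lo)

def additive(a, b):
    # Purely additive toy model: no interaction term
    return 2.0 * a + np.sin(b)

def interacting(a, b):
    # Same main effects plus an a * b interaction term
    return 2.0 * a + np.sin(b) + a * b

diff_additive = second_order_difference(additive, 0.0, 1.0, 0.0, 1.0)
diff_interacting = second_order_difference(interacting, 0.0, 1.0, 0.0, 1.0)

print(f"additive model:    {diff_additive:.3f}")
print(f"interacting model: {diff_interacting:.3f}")
```

For the additive model the main effects cancel in every cell, so the difference vanishes up to floating-point error; only the interaction term survives in the second model.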

[ ]:
features = [('temp2m', 'sfc_temp'), ('dwpt2m', 'sfc_temp'), ('temp2m', 'dwpt2m')]

ale_2d_ds = explainer.ale(
    features=features,
    n_bootstrap=1,
    subsample=1.0,
    n_jobs=len(features) * len(estimators),
    n_bins=30,
)
[ ]:
# Default plot includes KDE contours and scatter overlays
fig, axes = explainer.plot_ale(ale=ale_2d_ds)

Customizing 2D Plots

The marginal distributions, scatter points, and KDE contours can be toggled on or off.

[ ]:
# Recompute for a single feature pair for cleaner examples
features_single = [('temp2m', 'sfc_temp')]

ale_2d_single = explainer.ale(
    features=features_single,
    n_bootstrap=1,
    subsample=1.0,
    n_jobs=len(features_single) * len(estimators),
    n_bins=30,
)

# Plot without KDE or scatter overlays
fig, axes = explainer.plot_ale(
    ale=ale_2d_single,
    kde_curves=False,
    scatter=False,
)
[ ]:
# Plot as smoothed filled contours instead of a pixel-style image
fig, axes = explainer.plot_ale(
    ale=ale_2d_single,
    kde_curves=False,
    scatter=False,
    contours=True,
)

Custom Colorbar and Sizing

You can control the colorbar, figure size, font size, and number of histogram bins.

[ ]:
cbar_kwargs = {'extend': 'neither', 'ticks': MaxNLocator(3)}

fig, axes = explainer.plot_ale(
    ale=ale_2d_single,
    kde_curves=False,
    scatter=False,
    figsize=(10, 5),
    fontsize=10,
    cbar_kwargs=cbar_kwargs,
    bins=7,
)
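
To see what the `'ticks': MaxNLocator(3)` entry asks for, the locator can be queried on its own. This is a small sketch using only matplotlib; the value range is made up for illustration, and how skexplain forwards `cbar_kwargs` to the colorbar is assumed rather than confirmed here.

```python
from matplotlib.ticker import MaxNLocator

# Hypothetical colorbar range, chosen only for illustration
vmin, vmax = -0.04, 0.04

# nbins=3 requests at most 3 intervals, i.e. at most 4 ticks inside the range
locator = MaxNLocator(3)
ticks = locator.tick_values(vmin, vmax)

# tick_values may return candidates just outside the range; keep the visible ones
visible_ticks = [t for t in ticks if vmin <= t <= vmax]
print(visible_ticks)
```

A small tick count like this keeps the colorbar readable when several panels share one figure.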

2D Partial Dependence

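2D PD shows the joint marginal effect of two features: the model's prediction averaged over the dataset while both features are held at each pair of grid values. Unlike 2D ALE, it does not subtract the 1D effects, so the result contains both the main effects and any interaction.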
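
The averaging behind 2D PD can be written out by brute force. The following is a minimal NumPy sketch under stated assumptions, not skexplain's implementation: `partial_dependence_2d`, `toy_predict`, and the synthetic `X_toy` are hypothetical stand-ins for a fitted model and its training data.

```python
import numpy as np

def partial_dependence_2d(predict, X, col_a, col_b, grid_a, grid_b):
    """Average prediction with two columns forced to each pair of grid values."""
    pd_grid = np.empty((len(grid_a), len(grid_b)))
    for i, a in enumerate(grid_a):
        for j, b in enumerate(grid_b):
            X_mod = X.copy()
            X_mod[:, col_a] = a  # overwrite the first feature everywhere
            X_mod[:, col_b] = b  # overwrite the second feature everywhere
            pd_grid[i, j] = predict(X_mod).mean()
    return pd_grid

def toy_predict(X):
    # Hypothetical model: two main effects, one interaction, one nuisance feature
    return X[:, 0] + 2.0 * X[:, 1] + X[:, 0] * X[:, 1] + 0.5 * X[:, 2]

rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 3))

grid = np.linspace(-1.0, 1.0, 3)
pd_grid = partial_dependence_2d(toy_predict, X_toy, 0, 1, grid, grid)
print(pd_grid.round(3))
```

Because nothing is subtracted, the resulting surface still carries the main effects of both features in addition to the interaction term.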

[ ]:
features_2d = [('temp2m', 'sfc_temp'), ('dwpt2m', 'sfc_temp'), ('temp2m', 'dwpt2m')]

pd_2d_ds = explainer.pd(
    features=features_2d,
    n_bootstrap=1,
    subsample=1000,  # an integer is read as a number of samples, a float in (0, 1] as a fraction
    n_jobs=len(features_2d) * len(estimators),
    n_bins=10,
)

fig, axes = explainer.plot_pd(pd=pd_2d_ds)

2D ALE vs PD

Both 2D ALE and 2D PD can reveal feature interactions, but they differ in important ways:

  • 2D ALE isolates the pure interaction effect by subtracting the 1D effects. It accounts for feature correlations by using only the data points that fall near each grid cell. If there is no interaction, the 2D ALE is zero everywhere.

  • 2D PD shows the joint marginal effect, which includes both 1D effects and interactions. It does not account for correlations and can produce misleading results when features are dependent.

Use 2D ALE when you want to identify genuine feature interactions, especially with correlated features. Use 2D PD when features are independent and you want the full joint effect.
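
The correlation issue can be made concrete by counting how many cells of a 2D grid contain no data at all. PD still evaluates the model in those cells, whereas ALE works only from the points that actually fall in or near each cell. The snippet below is a small NumPy sketch on synthetic data; the variable names and the 0.95 correlation are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
rho = 0.95  # assumed correlation for the dependent pair

x1 = rng.normal(size=n)
x2_corr = rho * x1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
x2_indep = rng.normal(size=n)

def empty_cell_fraction(a, b, n_bins=10):
    """Fraction of quantile-grid cells that contain no observations."""
    edges_a = np.quantile(a, np.linspace(0.0, 1.0, n_bins + 1))
    edges_b = np.quantile(b, np.linspace(0.0, 1.0, n_bins + 1))
    counts, _, _ = np.histogram2d(a, b, bins=[edges_a, edges_b])
    return float((counts == 0).mean())

frac_corr = empty_cell_fraction(x1, x2_corr)
frac_indep = empty_cell_fraction(x1, x2_indep)

print(f"empty cells, correlated pair:  {frac_corr:.0%}")
print(f"empty cells, independent pair: {frac_indep:.0%}")
```

Every empty cell is a region where a 2D PD value rests entirely on model extrapolation, which is why PD is best reserved for feature pairs that are close to independent.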