The XAI module can be used to generate instance explanations for machine learning models.
This module currently works only with tabular datasets and requires a trained machine learning model and its training dataset. You can then explain specific instances from a test dataset to understand each feature's contribution to the model's prediction.
The resulting visualizations not only show users the most important features of the model but also help them evaluate whether the model is paying attention to the right factors.
Additionally, feature importances that don't make sense to a user who is knowledgeable about the problem may reveal issues with the model's prediction method.
Using the XAI Model
To use the XAI Model:
- Create the dataset you want to explain using the Virtualitics AI Platform's Dataset Asset. For the explainer to work, the Dataset Asset needs to specify which features are categorical. Provide any additional information the Dataset Asset needs to convert the categorical columns to other types of encodings.
- Import the Explainer Asset from predict_backend.ml.xai and initialize it.
- Call the Explainer's explain method on a select few instances, or use smart instance selection to automatically pick interesting instances for you (a sketch of this mode appears after the example code below).
# Imports: pandas and xgboost for the data and the model; the Explainer Asset comes from
# predict_backend.ml.xai (the Dataset and Dashboard assets are imported from their
# corresponding platform modules)
import pandas as pd
from xgboost import XGBRegressor
from predict_backend.ml.xai import Explainer

# Create explainer data asset on a one-hot encoded dataset created by pd.get_dummies
# modeling_data is the whole pd.DataFrame
# ohe_features is a list of the one-hot encoded columns
# categorical_cols is a list of column names which correspond to categorical features,
# i.e. if the only categorical feature was "Animal", which was transformed by pd.get_dummies
# into the columns "Animal_cat" and "Animal_dog", then categorical_cols = ["Animal"]
# predict_cols is the list of input features for the model
# encoding is a string representing the encoding of the data; options are "one_hot", "verbose", or "ordinal"
explain_data = Dataset(modeling_data[ohe_features], label="example", name="explainer dataset",
                       categorical_cols=categorical_cols, predict_cols=ohe_features, encoding="one_hot")
# Train the model to be explained
# target is the name of the column the model predicts
xgb = XGBRegressor(learning_rate=0.1, n_jobs=-1, n_estimators=100, max_depth=5, eval_metric='mae')
xgb.fit(modeling_data[ohe_features], modeling_data[target])

# Create explainer for the ensemble model
explainer = Explainer(model=xgb, training_data=explain_data, output_names=target, mode='regression',
                      label="example", name="xgb explainer", use_shap=True)
# Specify instance sets to explain and create Cards
# actual_test_samples is the held-out test DataFrame containing instances not used for training
low = actual_test_samples.loc[actual_test_samples['Customer ID'] == 'L4646119']   # low default probability
high = actual_test_samples.loc[actual_test_samples['Customer ID'] == 'L9257872']  # high default probability
explain_instances = pd.concat([low, high])
titles = ["Low Default Probability", "High Default Probability"]
plots = explainer.explain(explain_instances[ohe_features], method='manual', titles=titles,
                          expected_title="Average Probability of Default in next 30 days.",
                          predicted_title="Predicted Probability of Default in next 30 days.",
                          return_as="plots")
xai_plots_dash = Dashboard(plots)
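The call above uses the manual mode, where the instances to explain are passed in explicitly. Below is a minimal sketch of the smart instance selection mentioned in the steps above; the 'smart' method value and the arguments it accepts are assumptions inferred from this description rather than a verified signature, so check your platform version before using it.
# Hedged sketch: let the explainer pick interesting instances automatically.
# The method value 'smart' and the exact arguments it accepts are assumptions
# based on the description of smart instance selection above.
smart_plots = explainer.explain(actual_test_samples[ohe_features], method='smart',
                                expected_title="Average Probability of Default in next 30 days.",
                                predicted_title="Predicted Probability of Default in next 30 days.",
                                return_as="plots")
smart_dash = Dashboard(smart_plots)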
What to Expect
See the example image below:
Additional Details
- The Explainer Asset can also accept Model Assets rather than the models themselves (see the sketch after this list).
- Within the Explainer's explain function, in addition to the manual mode where specific instances are passed in, a smart mode allows the function to automatically pick interesting instances.
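As noted in the first bullet above, the Explainer can be given a Model Asset instead of the raw model object. The snippet below is a hypothetical sketch of that pattern; the Model class name, its import path, and its constructor arguments are assumptions and may differ in your platform version.
# Hypothetical sketch: wrap the trained model in a Model asset and hand that to the Explainer.
# The Model constructor shown here is illustrative only; consult your platform's Model asset API.
model_asset = Model(model=xgb, label="example", name="xgb model")
explainer_from_asset = Explainer(model=model_asset, training_data=explain_data, output_names=target,
                                 mode='regression', label="example", name="xgb model-asset explainer",
                                 use_shap=True)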