dice_ml.explainer_interfaces package

Submodules

dice_ml.explainer_interfaces.dice_pytorch module

Module to generate diverse counterfactual explanations using the PyTorch framework.

class dice_ml.explainer_interfaces.dice_pytorch.DicePyTorch(data_interface, model_interface)[source]

Bases: dice_ml.explainer_interfaces.explainer_base.ExplainerBase

compute_dist(x_hat, x1)[source]

Compute weighted distance between two vectors.
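As a rough illustration, a weighted distance of this kind can be sketched as a weighted L1 norm. The function name, the choice of L1, and the sample values below are assumptions made for illustration; this is not the library's implementation.

```python
def weighted_l1_distance(x_hat, x1, feature_weights):
    """Sum of per-feature absolute differences, each scaled by its weight."""
    if not (len(x_hat) == len(x1) == len(feature_weights)):
        raise ValueError("all inputs must have the same length")
    return sum(w * abs(a - b) for a, b, w in zip(x_hat, x1, feature_weights))


# Hypothetical encoded counterfactual (x_hat) and query instance (x1).
distance = weighted_l1_distance([0.2, 1.0, 0.5], [0.0, 1.0, 1.0], [2.0, 1.0, 4.0])
```

Features with a larger weight contribute more to the distance, so changing them is penalized more heavily.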

compute_diversity_loss()[source]

Computes the third part (diversity) of the loss function.

compute_loss()[source]

Computes the overall loss

compute_proximity_loss()[source]

Compute the second part (distance from x1) of the loss function.

compute_regularization_loss()[source]

Adds a linear equality constraint to the loss function to ensure that all levels of a categorical variable sum to one.
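One way to picture this constraint is as a squared penalty on each one-hot encoded group. The sketch below assumes a squared-deviation form and a hypothetical index layout; both are illustrative assumptions rather than the library's code.

```python
def categorical_sum_penalty(cf, categorical_groups):
    """Sum over groups of (sum of the group's one-hot levels - 1) squared."""
    penalty = 0.0
    for group in categorical_groups:
        level_sum = sum(cf[i] for i in group)
        penalty += (level_sum - 1.0) ** 2
    return penalty


# Hypothetical layout: index 0 is continuous; indices 1-3 and 4-5 hold the
# one-hot levels of two categorical features.
cf = [0.7, 0.5, 0.25, 0.25, 0.5, 1.0]
groups = [[1, 2, 3], [4, 5]]
penalty = categorical_sum_penalty(cf, groups)
```

The first group already sums to one and adds nothing; the second sums to 1.5 and is penalized.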

compute_yloss()[source]

Computes the first part (y-loss) of the loss function.
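For intuition, a hinge-style y-loss on a binary target can be sketched as below. The function name, the use of the logit of the predicted probability, and the clipping constant are assumptions for illustration; the library's exact formulation may differ.

```python
import math


def hinge_yloss(prob, target, eps=1e-6):
    """Hinge loss on the logit of a predicted probability for a 0/1 target."""
    prob = min(max(prob, eps), 1.0 - eps)  # keep the logit finite
    logit = math.log(prob / (1.0 - prob))
    label = 2 * target - 1  # map {0, 1} to {-1, +1}
    return max(0.0, 1.0 - label * logit)


loss_undecided = hinge_yloss(0.5, target=1)   # logit is 0, so the loss is 1
loss_confident = hinge_yloss(0.99, target=1)  # large positive logit, zero loss
```

The loss vanishes once the prediction is confidently on the desired side, so the optimizer stops pushing the counterfactual further than needed.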

do_cf_initializations(total_CFs, algorithm, features_to_vary)[source]

Initializes CFs and other related variables.

do_loss_initializations(yloss_type, diversity_loss_type, feature_weights)[source]

Initializes variables related to the main loss function.

do_optimizer_initializations(optimizer, learning_rate)[source]

Initializes gradient-based PyTorch optimizers.

dpp_style(submethod)[source]

Computes the DPP of a matrix.
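The idea behind a DPP-style (determinantal point process) diversity term can be sketched as the determinant of a kernel matrix built from pairwise distances. The kernel form 1 / (1 + distance), the helper names, and the pure-Python determinant below are assumptions chosen to keep the example self-contained; they are not the library's code.

```python
def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def dpp_diversity(cfs, feature_weights):
    """Determinant of the kernel K[i][j] = 1 / (1 + weighted L1 distance)."""
    def dist(u, v):
        return sum(w * abs(p - q) for p, q, w in zip(u, v, feature_weights))

    kernel = [[1.0 / (1.0 + dist(u, v)) for v in cfs] for u in cfs]
    return determinant(kernel)


weights = [1.0, 1.0]
spread = dpp_diversity([[0.0, 0.0], [1.0, 1.0]], weights)
clustered = dpp_diversity([[0.0, 0.0], [0.1, 0.0]], weights)
```

Counterfactuals that lie far apart yield a larger determinant than nearly identical ones, which is why maximizing this quantity encourages diversity.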

find_counterfactuals(query_instance, desired_class, optimizer, learning_rate, min_iter, max_iter, project_iter, loss_diff_thres, loss_converge_maxiter, verbose, init_near_query_instance, tie_random, stopping_threshold, posthoc_sparsity_param, posthoc_sparsity_algorithm)[source]

Finds counterfactuals by gradient descent.

generate_counterfactuals(query_instance, total_CFs, desired_class='opposite', proximity_weight=0.5, diversity_weight=1.0, categorical_penalty=0.1, algorithm='DiverseCF', features_to_vary='all', yloss_type='hinge_loss', diversity_loss_type='dpp_style:inverse_dist', feature_weights='inverse_mad', optimizer='pytorch:adam', learning_rate=0.05, min_iter=500, max_iter=5000, project_iter=0, loss_diff_thres=1e-05, loss_converge_maxiter=1, verbose=False, init_near_query_instance=True, tie_random=False, stopping_threshold=0.5, posthoc_sparsity_param=0.1, posthoc_sparsity_algorithm='linear')[source]

Generates diverse counterfactual explanations

Parameters
  • query_instance – A dictionary of feature names and values. Test point of interest.

  • total_CFs – Total number of counterfactuals required.

  • desired_class – Desired counterfactual class; can take 0 or 1. The default, “opposite”, selects the class opposite to the outcome class of query_instance for binary classification.

  • proximity_weight – A positive float. The larger this weight, the closer the counterfactuals are to the query_instance.

  • diversity_weight – A positive float. The larger this weight, the more diverse the counterfactuals are.

  • categorical_penalty – A positive float. A weight to ensure that all levels of a categorical variable sum to 1.

  • algorithm – Counterfactual generation algorithm. Either “DiverseCF” or “RandomInitCF”.

  • features_to_vary – Either a string “all” or a list of feature names to vary.

  • yloss_type – Metric for y-loss of the optimization function. Takes “l2_loss” or “log_loss” or “hinge_loss”.

  • diversity_loss_type – Metric for diversity loss of the optimization function. Takes “avg_dist” or “dpp_style:inverse_dist”.

  • feature_weights – Either “inverse_mad” or a dictionary with feature names as keys and corresponding weights as values. The default option is “inverse_mad”, where the weight for a continuous feature is the inverse of the Median Absolute Deviation (MAD) of the feature’s values in the training set; the weight for a categorical feature is equal to 1 by default.

  • optimizer – PyTorch optimization algorithm. Currently tested only with “pytorch:adam”.

  • learning_rate – Learning rate for optimizer.

  • min_iter – Minimum number of iterations to run gradient descent for.

  • max_iter – Maximum number of iterations to run gradient descent for.

  • project_iter – Project the gradients at an interval of this many iterations.

  • loss_diff_thres – Minimum difference between successive loss values to check convergence.

  • loss_converge_maxiter – Maximum number of iterations for loss_diff_thres to hold to declare convergence. Defaults to 1, but we assigned a more conservative value of 2 in the paper.

  • verbose – Print intermediate loss value.

  • init_near_query_instance – Boolean to indicate if counterfactuals are to be initialized near query_instance.

  • tie_random – Used in rounding off CFs and intermediate projection.

  • stopping_threshold – Minimum threshold for the target-class probability of the counterfactuals.

  • posthoc_sparsity_param – Parameter for the post-hoc operation on continuous features to enhance sparsity.

  • posthoc_sparsity_algorithm – Perform either linear or binary search. Takes “linear” or “binary”. Prefer binary search when a feature range is large (for instance, income varying from 10k to 1000k) and only if the features share a monotonic relationship with predicted outcome in the model.

Returns

A CounterfactualExamples object to store and visualize the resulting counterfactual explanations (see diverse_counterfactuals.py).
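The “inverse_mad” option for feature_weights described above can be illustrated with a small standalone computation. The helper name, the column layout, and the fallback weight of 1 when the MAD is zero are assumptions for this sketch, not the library's behavior.

```python
import statistics


def inverse_mad_weights(columns, categorical=()):
    """Weight continuous features by 1 / MAD; categorical features get 1."""
    weights = {}
    for name, values in columns.items():
        if name in categorical:
            weights[name] = 1.0
            continue
        med = statistics.median(values)
        mad = statistics.median([abs(v - med) for v in values])
        # Assumption: fall back to 1.0 when the MAD is zero.
        weights[name] = 1.0 / mad if mad > 0 else 1.0
    return weights


# Hypothetical training columns.
training = {
    "age": [20, 30, 40, 50, 60],
    "hours": [38, 40, 40, 42, 60],
    "job": ["a", "b", "a", "c", "b"],
}
weights = inverse_mad_weights(training, categorical=("job",))
```

A feature whose values vary little in the training data receives a larger weight, so moving it by one unit counts as a bigger change.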

get_model_output(input_instance)[source]

Gets the output probability of the ML model.

initialize_CFs(query_instance, init_near_query_instance=False)[source]

Initialize counterfactuals.

predict_fn(input_instance)[source]

Prediction function.

round_off_cfs(assign=False)[source]

Function for intermediate projection of CFs.
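As a hedged sketch of what projecting a one-hot encoded categorical feature might look like, the snippet below snaps a relaxed group of levels to the single largest one, optionally breaking ties at random. The function name and the exact tie-breaking behavior are assumptions for illustration only.

```python
import random


def project_one_hot(values, tie_random=False, rng=random):
    """Snap relaxed one-hot levels to a hard one-hot vector (argmax)."""
    top = max(values)
    candidates = [i for i, v in enumerate(values) if v == top]
    # Assumption: without tie_random the first maximal level wins.
    chosen = rng.choice(candidates) if tie_random else candidates[0]
    return [1.0 if i == chosen else 0.0 for i in range(len(values))]


projected = project_one_hot([0.2, 0.7, 0.1])
```

After projection each categorical feature holds exactly one active level, which keeps intermediate counterfactuals valid.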

stop_loop(itr, loss_diff)[source]

Determines the stopping condition for gradient descent.
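A simplified sketch of such a stopping rule, built only from the min_iter, max_iter, loss_diff_thres and loss_converge_maxiter parameters documented above, might look as follows. The class name is hypothetical, and the check against stopping_threshold is deliberately left out to keep the example short.

```python
class ConvergenceTracker:
    """Counts consecutive iterations whose loss change is below a threshold."""

    def __init__(self, min_iter, max_iter, loss_diff_thres, loss_converge_maxiter):
        self.min_iter = min_iter
        self.max_iter = max_iter
        self.loss_diff_thres = loss_diff_thres
        self.loss_converge_maxiter = loss_converge_maxiter
        self.small_diff_streak = 0

    def should_stop(self, itr, loss_diff):
        if itr < self.min_iter:
            return False  # always run the minimum number of iterations
        if itr >= self.max_iter:
            return True  # hard cap on the number of iterations
        if loss_diff <= self.loss_diff_thres:
            self.small_diff_streak += 1
            return self.small_diff_streak >= self.loss_converge_maxiter
        self.small_diff_streak = 0  # loss is still moving, reset the streak
        return False


tracker = ConvergenceTracker(
    min_iter=2, max_iter=10, loss_diff_thres=1e-5, loss_converge_maxiter=2
)
decisions = [tracker.should_stop(itr, 1e-6) for itr in range(4)]
```

With loss_converge_maxiter set to 2, the loop stops only after two consecutive iterations with a small loss change.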

update_hyperparameters(proximity_weight, diversity_weight, categorical_penalty)[source]

Update hyperparameters of the loss function

dice_ml.explainer_interfaces.dice_tensorflow1 module

Module to generate diverse counterfactual explanations based on TensorFlow 1.x

class dice_ml.explainer_interfaces.dice_tensorflow1.DiceTensorFlow1(data_interface, model_interface)[source]

Bases: dice_ml.explainer_interfaces.explainer_base.ExplainerBase

compute_dist(x_hat, x1)[source]

Compute weighted distance between two vectors.

compute_diversity_loss(method)[source]

Computes the third part (diversity) of the loss function.

compute_proximity_loss()[source]

Compute the second part (distance from x1) of the loss function.

compute_regularization_loss()[source]

Adds a linear equality constraint to the loss function to ensure that all levels of a categorical variable sum to one.

compute_yloss(method)[source]

Computes the first part (y-loss) of the loss function.

do_cf_initializations(total_CFs, algorithm, features_to_vary)[source]

Initializes TF variables required for CF generation.

do_loss_initializations(yloss_type='hinge_loss', diversity_loss_type='dpp_style:inverse_dist', feature_weights='inverse_mad')[source]

Defines the optimization loss

do_optimizer_initializations(optimizer)[source]

Initializes gradient-based TF optimizers.

dpp_style(submethod)[source]

Computes the DPP of a matrix.

find_counterfactuals(query_instance, desired_class='opposite', learning_rate=0.05, min_iter=500, max_iter=5000, project_iter=0, loss_diff_thres=1e-05, loss_converge_maxiter=1, verbose=False, init_near_query_instance=False, tie_random=False, stopping_threshold=0.5, posthoc_sparsity_param=0.1, posthoc_sparsity_algorithm='linear')[source]

Finds counterfactuals by gradient descent.

generate_counterfactuals(query_instance, total_CFs, desired_class='opposite', proximity_weight=0.5, diversity_weight=1.0, categorical_penalty=0.1, algorithm='DiverseCF', features_to_vary='all', yloss_type='hinge_loss', diversity_loss_type='dpp_style:inverse_dist', feature_weights='inverse_mad', optimizer='tensorflow:adam', learning_rate=0.05, min_iter=500, max_iter=5000, project_iter=0, loss_diff_thres=1e-05, loss_converge_maxiter=1, verbose=False, init_near_query_instance=True, tie_random=False, stopping_threshold=0.5, posthoc_sparsity_param=0.1, posthoc_sparsity_algorithm='linear')[source]

Generates diverse counterfactual explanations

Parameters
  • query_instance – A dictionary of feature names and values. Test point of interest.

  • total_CFs – Total number of counterfactuals required.

  • desired_class – Desired counterfactual class; can take 0 or 1. The default, “opposite”, selects the class opposite to the outcome class of query_instance for binary classification.

  • proximity_weight – A positive float. The larger this weight, the closer the counterfactuals are to the query_instance.

  • diversity_weight – A positive float. The larger this weight, the more diverse the counterfactuals are.

  • categorical_penalty – A positive float. A weight to ensure that all levels of a categorical variable sum to 1.

  • algorithm – Counterfactual generation algorithm. Either “DiverseCF” or “RandomInitCF”.

  • features_to_vary – Either a string “all” or a list of feature names to vary.

  • yloss_type – Metric for y-loss of the optimization function. Takes “l2_loss” or “log_loss” or “hinge_loss”.

  • diversity_loss_type – Metric for diversity loss of the optimization function. Takes “avg_dist” or “dpp_style:inverse_dist”.

  • feature_weights – Either “inverse_mad” or a dictionary with feature names as keys and corresponding weights as values. The default option is “inverse_mad”, where the weight for a continuous feature is the inverse of the Median Absolute Deviation (MAD) of the feature’s values in the training set; the weight for a categorical feature is equal to 1 by default.

  • optimizer – TensorFlow optimization algorithm. Currently tested only with “tensorflow:adam”.

  • learning_rate – Learning rate for optimizer.

  • min_iter – Minimum number of iterations to run gradient descent for.

  • max_iter – Maximum number of iterations to run gradient descent for.

  • project_iter – Project the gradients at an interval of this many iterations.

  • loss_diff_thres – Minimum difference between successive loss values to check convergence.

  • loss_converge_maxiter – Maximum number of iterations for loss_diff_thres to hold to declare convergence. Defaults to 1, but we assigned a more conservative value of 2 in the paper.

  • verbose – Print intermediate loss value.

  • init_near_query_instance – Boolean to indicate if counterfactuals are to be initialized near query_instance.

  • tie_random – Used in rounding off CFs and intermediate projection.

  • stopping_threshold – Minimum threshold for the target-class probability of the counterfactuals.

  • posthoc_sparsity_param – Parameter for the post-hoc operation on continuous features to enhance sparsity.

  • posthoc_sparsity_algorithm – Perform either linear or binary search. Takes “linear” or “binary”. Prefer binary search when a feature range is large (for instance, income varying from 10k to 1000k) and only if the features share a monotonic relationship with predicted outcome in the model.

Returns

A CounterfactualExamples object to store and visualize the resulting counterfactual explanations (see diverse_counterfactuals.py).

initialize_CFs(query_instance, init_near_query_instance=False)[source]

Initialize counterfactuals.

predict_fn(input_instance)[source]

Prediction function.

round_off_cfs(assign=False)[source]

Function for intermediate projection of CFs.

scipy_optimizers(method='Nelder-Mead')[source]
stop_loop(itr, loss_diff)[source]

Determines the stopping condition for gradient descent.

tensorflow_optimizers(method='adam')[source]

Initializes TensorFlow optimizers.

update_hyperparameters(proximity_weight=0.5, diversity_weight=0.5, categorical_penalty=0.1)[source]

Updates hyperparameters.

dice_ml.explainer_interfaces.dice_tensorflow2 module

Module to generate diverse counterfactual explanations based on TensorFlow 2.x

class dice_ml.explainer_interfaces.dice_tensorflow2.DiceTensorFlow2(data_interface, model_interface)[source]

Bases: dice_ml.explainer_interfaces.explainer_base.ExplainerBase

compute_dist(x_hat, x1)[source]

Compute weighted distance between two vectors.

compute_diversity_loss()[source]

Computes the third part (diversity) of the loss function.

compute_loss()[source]

Computes the overall loss

compute_proximity_loss()[source]

Compute the second part (distance from x1) of the loss function.

compute_regularization_loss()[source]

Adds a linear equality constraint to the loss function to ensure that all levels of a categorical variable sum to one.

compute_yloss()[source]

Computes the first part (y-loss) of the loss function.

do_cf_initializations(total_CFs, algorithm, features_to_vary)[source]

Initializes CFs and other related variables.

do_loss_initializations(yloss_type, diversity_loss_type, feature_weights)[source]

Initializes variables related to the main loss function.

do_optimizer_initializations(optimizer, learning_rate)[source]

Initializes gradient-based TensorFlow optimizers.

dpp_style(submethod)[source]

Computes the DPP of a matrix.

find_counterfactuals(query_instance, desired_class, optimizer, learning_rate, min_iter, max_iter, project_iter, loss_diff_thres, loss_converge_maxiter, verbose, init_near_query_instance, tie_random, stopping_threshold, posthoc_sparsity_param, posthoc_sparsity_algorithm)[source]

Finds counterfactuals by gradient descent.

generate_counterfactuals(query_instance, total_CFs, desired_class='opposite', proximity_weight=0.5, diversity_weight=1.0, categorical_penalty=0.1, algorithm='DiverseCF', features_to_vary='all', yloss_type='hinge_loss', diversity_loss_type='dpp_style:inverse_dist', feature_weights='inverse_mad', optimizer='tensorflow:adam', learning_rate=0.05, min_iter=500, max_iter=5000, project_iter=0, loss_diff_thres=1e-05, loss_converge_maxiter=1, verbose=False, init_near_query_instance=True, tie_random=False, stopping_threshold=0.5, posthoc_sparsity_param=0.1, posthoc_sparsity_algorithm='linear')[source]

Generates diverse counterfactual explanations

Parameters
  • query_instance – A dictionary of feature names and values. Test point of interest.

  • total_CFs – Total number of counterfactuals required.

  • desired_class – Desired counterfactual class; can take 0 or 1. The default, “opposite”, selects the class opposite to the outcome class of query_instance for binary classification.

  • proximity_weight – A positive float. The larger this weight, the closer the counterfactuals are to the query_instance.

  • diversity_weight – A positive float. The larger this weight, the more diverse the counterfactuals are.

  • categorical_penalty – A positive float. A weight to ensure that all levels of a categorical variable sum to 1.

  • algorithm – Counterfactual generation algorithm. Either “DiverseCF” or “RandomInitCF”.

  • features_to_vary – Either a string “all” or a list of feature names to vary.

  • yloss_type – Metric for y-loss of the optimization function. Takes “l2_loss” or “log_loss” or “hinge_loss”.

  • diversity_loss_type – Metric for diversity loss of the optimization function. Takes “avg_dist” or “dpp_style:inverse_dist”.

  • feature_weights – Either “inverse_mad” or a dictionary with feature names as keys and corresponding weights as values. The default option is “inverse_mad”, where the weight for a continuous feature is the inverse of the Median Absolute Deviation (MAD) of the feature’s values in the training set; the weight for a categorical feature is equal to 1 by default.

  • optimizer – TensorFlow optimization algorithm. Currently tested only with “tensorflow:adam”.

  • learning_rate – Learning rate for optimizer.

  • min_iter – Minimum number of iterations to run gradient descent for.

  • max_iter – Maximum number of iterations to run gradient descent for.

  • project_iter – Project the gradients at an interval of this many iterations.

  • loss_diff_thres – Minimum difference between successive loss values to check convergence.

  • loss_converge_maxiter – Maximum number of iterations for loss_diff_thres to hold to declare convergence. Defaults to 1, but we assigned a more conservative value of 2 in the paper.

  • verbose – Print intermediate loss value.

  • init_near_query_instance – Boolean to indicate if counterfactuals are to be initialized near query_instance.

  • tie_random – Used in rounding off CFs and intermediate projection.

  • stopping_threshold – Minimum threshold for the target-class probability of the counterfactuals.

  • posthoc_sparsity_param – Parameter for the post-hoc operation on continuous features to enhance sparsity.

  • posthoc_sparsity_algorithm – Perform either linear or binary search. Takes “linear” or “binary”. Prefer binary search when a feature range is large (for instance, income varying from 10k to 1000k) and only if the features share a monotonic relationship with predicted outcome in the model.

Returns

A CounterfactualExamples object to store and visualize the resulting counterfactual explanations (see diverse_counterfactuals.py).

initialize_CFs(query_instance, init_near_query_instance=False)[source]

Initialize counterfactuals.

predict_fn(input_instance)[source]

Prediction function.

round_off_cfs(assign=False)[source]

Function for intermediate projection of CFs.

stop_loop(itr, loss_diff)[source]

Determines the stopping condition for gradient descent.

update_hyperparameters(proximity_weight, diversity_weight, categorical_penalty)[source]

Update hyperparameters of the loss function

dice_ml.explainer_interfaces.explainer_base module

Module containing a template class to generate counterfactual explanations. Subclasses implement interfaces for different ML frameworks such as TensorFlow or PyTorch. All methods are in dice_ml.explainer_interfaces

class dice_ml.explainer_interfaces.explainer_base.ExplainerBase(data_interface)[source]

Bases: object

do_posthoc_sparsity_enhancement(final_cfs_sparse, cfs_preds_sparse, query_instance, posthoc_sparsity_param, posthoc_sparsity_algorithm)[source]

Post-hoc method to encourage sparsity in the generated counterfactuals.

Parameters
  • final_cfs_sparse – List of final CFs in numpy format.

  • cfs_preds_sparse – List of predicted outcomes of final CFs in numpy format.

  • query_instance – Query instance in numpy format.

  • posthoc_sparsity_param – Parameter for the post-hoc operation on continuous features to enhance sparsity.

  • posthoc_sparsity_algorithm – Perform either linear or binary search. Prefer binary search when a feature range is large (for instance, income varying from 10k to 1000k) and only if the features share a monotonic relationship with predicted outcome in the model.

generate_counterfactuals()[source]

Performs a binary search between continuous features of a CF and corresponding values in query_instance until the prediction class changes.

Performs a greedy linear search - moves the continuous features in CFs towards original values in query_instance greedily until the prediction class changes.
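The greedy linear search described above can be sketched for a single continuous feature as follows. The function name, the fixed step size, and the toy predict function are assumptions for illustration; the library's own search operates on its internal data structures.

```python
def greedy_linear_sparsify(cf, query, feature_idx, predict, step, target_class):
    """Move one feature of cf toward the query while the class stays the same."""
    cf = list(cf)
    while cf[feature_idx] != query[feature_idx]:
        diff = query[feature_idx] - cf[feature_idx]
        candidate = list(cf)
        if abs(diff) <= step:
            candidate[feature_idx] = query[feature_idx]  # final partial step
        else:
            candidate[feature_idx] += step if diff > 0 else -step
        if predict(candidate) != target_class:
            break  # one more step would change the prediction, so stop here
        cf = candidate
    return cf


# Hypothetical model: predicts class 1 when the first feature is at least 50.
def predict(x):
    return 1 if x[0] >= 50.0 else 0


sparse_cf = greedy_linear_sparsify(
    cf=[70.0], query=[30.0], feature_idx=0, predict=predict, step=5.0, target_class=1
)
```

The result is the counterfactual closest to the query, in steps of the chosen size, that still receives the desired prediction.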

dice_ml.explainer_interfaces.feasible_base_vae module

class dice_ml.explainer_interfaces.feasible_base_vae.FeasibleBaseVAE(data_interface, model_interface, **kwargs)[source]

Bases: dice_ml.explainer_interfaces.explainer_base.ExplainerBase

compute_loss(model_out, x, target_label)[source]
generate_counterfactuals(query_instance, total_CFs, desired_class='opposite')[source]
train(pre_trained=False)[source]

pre_trained: Boolean flag that checks whether a pre-trained model exists, to avoid training again.

dice_ml.explainer_interfaces.feasible_model_approx module

class dice_ml.explainer_interfaces.feasible_model_approx.FeasibleModelApprox(data_interface, model_interface, **kwargs)[source]

Bases: dice_ml.explainer_interfaces.feasible_base_vae.FeasibleBaseVAE, dice_ml.explainer_interfaces.explainer_base.ExplainerBase

train(constraint_type, constraint_variables, constraint_direction, constraint_reg, pre_trained=False)[source]
Parameters
  • pre_trained – Boolean flag that checks whether a pre-trained model exists, to avoid training again.

  • constraint_type – Binary variable, currently (1) unary / (0) monotonic.

  • constraint_variables – List of lists: [[Effect, Cause1, Cause2, …. ]]

  • constraint_direction – -1: negative, 1: positive (by default, has to be 1 for monotonic constraints).

  • constraint_reg – Tunable hyperparameter.

Returns

None

Module contents