# Inspection

The `inspection` package contains experimental tools that help you get insights from `Feyn` graphs. They can be found inside `feyn.__future__.contrib.inspection`:

# pdheatmap

```
plot_importance_dataframe(graph, dataframe, bg_data=None, kind='bar', legend=True, cmap='RdYlGn')
    """Paints a Pandas DataFrame according to feature importance in the provided model.

    Parameters:
        graph: The feyn graph to predict on
        dataframe: The dataframe to paint
        [bg_data]: The dataframe to use for background data to get more accurate importance scores
        [kind]: The kind of coloring. Options: 'fill' or 'bar'
        [cmap]: The colormap to use (if kind='fill')

    Returns:
        A styled Pandas DataFrame with a color gradient or bar fill according to feature importances.
    """
```

Basic usage:

```
from feyn.__future__.contrib.inspection import plot_importance_dataframe
plot_importance_dataframe(graph, df, bg_data=train)
```

This displays your pandas DataFrame `df` styled with bars, showing which features contributed, and in which direction, to the final prediction of the given graph.

# KernelShap

```
class KernelShap:
    __init__(self, graph, background_data):
        """
        Constructs a new 'KernelShap' object.

        Arguments:
            graph {feyn.Graph} -- A feyn Graph object.
            background_data {pandas.DataFrame} -- Usually the training DataFrame that the model was fitted to.
        """

    SHAP(self, instances, n_samples='auto', format='numpy'):
        """
        Approximates the SHAP values of instances.

        Keyword arguments:
            instances -- A pandas.DataFrame of the datapoints to compute SHAP values for.
            n_samples -- Either 'auto' or an integer. When n_samples is an integer, it is the number of samples drawn for each combination of subsets of the features in the model. The higher the number, the more accurate the approximation but the slower the calculation.
                The keyword 'auto' varies the number of samples per combination: more samples are taken for the smallest and largest subsets, and fewer for subsets that are about half the size of the full set of features.
            format -- Either 'numpy' or 'pandas'. Determines the format of the return value.

        Returns:
            numpy.array -- Size #instances x #features, where each entry represents the SHAP value of that feature for that particular instance.
            pandas.DataFrame -- Each row is an instance and the columns are the SHAP values of the features.
        """

    feature_plot(self, shap_values, figsize=(8, 4), show_graph=True):
        """
        Finds feature importance from calculated SHAP values. This calculates the mean of the absolute SHAP values of each feature.

        Keyword arguments:
            shap_values -- A numpy array of SHAP values of chosen instances.
            figsize -- A tuple that determines the size of the figure.
            show_graph -- Boolean that determines whether the graph is displayed along with the figure.

        Returns:
            matplotlib horizontal bar chart. The y-axis lists each feature and the x-axis shows the mean absolute SHAP value of each feature.
        """
```

Basic usage:

```
from feyn.__future__.contrib.inspection import KernelShap
shap = KernelShap(graph, background_data=train)
importances = shap.SHAP(df, format='pandas')  # 'pandas' is optional; 'numpy' is the default.
shap.feature_plot(importances)  # feature_plot is a method of KernelShap
```

This calculates the SHAP values of your dataset from your graph, using your training data as background (recommended), and plots them.
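For intuition about what is being approximated, the sketch below computes exact Shapley values for a toy model by enumerating every feature subset, with features outside a subset filled in from background rows. It is not the `KernelShap` implementation: `exact_shapley`, `toy_predict`, and the data are hypothetical stand-ins, and exhaustive enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial

import numpy as np


def exact_shapley(predict, instance, background):
    """Exact Shapley values for one instance (hypothetical helper, not part of feyn)."""
    n_features = len(instance)
    features = range(n_features)

    def value(subset):
        # Start from the background rows, then fix the subset to the instance's values.
        rows = background.copy()
        for j in subset:
            rows[:, j] = instance[j]
        return predict(rows).mean()

    phi = np.zeros(n_features)
    for i in features:
        others = [j for j in features if j != i]
        for size in range(len(others) + 1):
            # Shapley weight shared by every subset of this size.
            weight = factorial(size) * factorial(n_features - size - 1) / factorial(n_features)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


# Toy stand-in for a fitted model: a linear function of three features.
def toy_predict(rows):
    return 2.0 * rows[:, 0] - 1.0 * rows[:, 1] + 0.5 * rows[:, 2]


background = np.array([[0.0, 0.0, 0.0], [2.0, 4.0, 6.0]])
instance = np.array([3.0, 1.0, 2.0])

phi = exact_shapley(toy_predict, instance, background)
print(phi)  # approximately [4.0, 1.0, -0.5]
```

The values sum to the difference between the prediction for the instance and the mean prediction over the background rows. The per-subset weight is largest for the smallest and largest subsets, which is consistent with `n_samples='auto'` sampling those subsets more heavily.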

Once you have the SHAP values, you can also visualize them with any existing SHAP plotting library, if you prefer.
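If you want the importance scores themselves instead of a chart, the quantity described in the `feature_plot` docstring (the mean of the absolute SHAP values per feature) can be computed directly. This is a minimal sketch using made-up SHAP values and hypothetical feature names, not output from a real graph:

```python
import numpy as np

# Made-up SHAP values: 4 instances (rows) x 3 features (columns).
shap_values = np.array([
    [0.5, -0.2, 0.0],
    [-1.5, 0.4, 0.1],
    [1.0, -0.6, -0.1],
    [-1.0, 0.8, 0.2],
])
feature_names = ["age", "income", "height"]  # hypothetical names

# Mean of the absolute SHAP values per feature.
importance = np.abs(shap_values).mean(axis=0)

# Sort from most to least important.
ranking = sorted(zip(feature_names, importance), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With a real result you would replace `shap_values` with the array returned by `SHAP(..., format='numpy')`.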

# get_activations_df

```
get_activations_df(dataframe, graph):
    """
    Returns a pandas dataframe of activations of each interaction for a given graph.

    Arguments:
        dataframe {[pd.DataFrame]} -- The datapoints to find the activations of.
        graph {[feyn.Graph]} -- The graph with the interactions.

    Returns:
        [pd.DataFrame] -- Extended original dataframe of feature and activation values.
    """
```

Basic usage:

```
from feyn.__future__.contrib.inspection import get_activations_df
act_df = get_activations_df(dataframe=train, graph=graph)
```

This calculates the activation values of each interaction in the `graph` for your dataset.
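To picture what an "extended" dataframe means here, the sketch below builds one by hand with pandas. The two interactions (an addition followed by `tanh`) and the column names are made up for illustration; the columns that `get_activations_df` actually produces may be named and ordered differently.

```python
import numpy as np
import pandas as pd

# Toy data with two input features.
df = pd.DataFrame({"x": [0.0, 1.0, 2.0], "y": [1.0, 1.0, -2.0]})

# Hypothetical stand-in for a two-interaction graph: add the inputs, then apply tanh.
activations = pd.DataFrame(index=df.index)
activations["add(x, y)"] = df["x"] + df["y"]
activations["tanh(add)"] = np.tanh(activations["add(x, y)"])

# Extend the original dataframe with one column per interaction.
act_df = pd.concat([df, activations], axis=1)
print(act_df)
```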

# plot_interaction

```
plot_interaction(graph, interaction, dataframe, scaling=None):
    """
    Plots the activation values of datapoints of an interaction. If the interaction is a register then it plots the distribution of points.

    Arguments:
        graph {[feyn.Graph]} -- The graph to calculate activation values
        interaction {[feyn.Interaction]} -- The interaction to plot
        dataframe {[pd.DataFrame]} -- The datapoints to plot
        scaling {[list of strings]} -- The features that will display scaling on axis

    Returns:
        [plotly.graph_objs._figure.Figure] -- Either a scatter plot of datapoints or a histogram of the feature distribution.
    """
```

Basic usage:

```
from feyn.__future__.contrib.inspection import plot_interaction
# The interaction at index 2 is neither a feature nor the output
int_num = 2
interaction = graph[int_num]
# graph[0] is an input to graph[2]
fig = plot_interaction(graph, interaction=interaction, dataframe=train, scaling=[graph[0].name])
```

This returns a plot of the interaction. If the interaction is a `register`, then this displays a histogram of the values in `dataframe`. If the interaction is the `output`, then this displays two histograms: one of the `actual` values and one of the `predicted` values. If the interaction is `hidden`, that is, between input and output, then this displays a plot of the activations. If the interaction has `one input`, then this displays a histogram with an activation plot. If the interaction has `two inputs`, then this is a scatter plot where the colour corresponds to the activation.
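The cases above can be summarised as a small lookup. The function below is only an illustration of that decision table: `describe_plot` and its `kind` labels are hypothetical and not part of feyn, and it returns a description string, not a figure.

```python
def describe_plot(kind, n_inputs=0):
    """Return a description of the plot for a given interaction (illustrative only).

    `kind` is one of "register", "output" or "hidden".
    """
    if kind == "register":
        return "histogram of the feature values"
    if kind == "output":
        return "two histograms: actual and predicted values"
    if kind == "hidden":
        if n_inputs == 1:
            return "histogram with an activation plot"
        if n_inputs == 2:
            return "scatter plot coloured by activation"
        raise ValueError("a hidden interaction is expected to have one or two inputs")
    raise ValueError(f"unknown interaction kind: {kind!r}")


print(describe_plot("register"))
print(describe_plot("hidden", n_inputs=2))
```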

# plot_categories

```
plot_categories(register):
    """Plots the weights of each category of a categorical register.

    Arguments:
        register {feyn._Register} -- Categorical register

    Returns:
        [plotly.graph_objs._figure.Figure] -- Bar chart of weights
    """
```

Basic usage:

```
from feyn.__future__.contrib.inspection import plot_categories
register = graph[0]
plot_categories(register)
```

This plots the weights of the `categorical` register.