Inspection
The inspection package contains experimental tools that help you gain insight into Feyn graphs. They can be found in feyn.__future__.contrib.inspection:
plot_importance_dataframe
def plot_importance_dataframe(graph,
                              dataframe,
                              bg_data=None,
                              kind='bar',
                              legend=True,
                              cmap='RdYlGn',
                              max_ref_samples=100):
    """ Paints a Pandas DataFrame according to feature importance in the provided model.

    Arguments:
        graph {feyn.Graph} -- The feyn graph to predict on
        dataframe {pd.DataFrame} -- The dataframe to paint

    Keyword Arguments:
        bg_data {pd.DataFrame} -- The dataframe to use for background data to get more accurate importance scores (default: {None})
        kind {str} -- The kind of coloring. Options: 'fill' or 'bar' (default: {'bar'})
        legend {bool} -- Show a legend (if kind='fill') (default: {True})
        cmap {str} -- The colormap to use (if kind='fill') (default: {'RdYlGn'})
        max_ref_samples {int} -- How many reference samples to take to define the limits of the plot (default: {100})

    Raises:
        Exception: if bad [kind] argument supplied.

    Returns:
        [pd.DataFrame.styling] -- A dataframe styled with colors according to importances
    """
Basic usage:
from feyn.__future__.contrib.inspection import plot_importance_dataframe
plot_importance_dataframe(graph, df, bg_data=train)
This displays your pandas DataFrame df styled as a bar chart, showing which features contributed to the final prediction of the given graph, and in which direction.
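To make "contributed in what direction" concrete, here is a minimal sketch that does not use feyn at all. It assumes a plain linear model, where the signed contribution of a feature is its weight times its distance from the mean of the background data. How plot_importance_dataframe actually computes importances is not stated here, and the helper name signed_contributions, the weights and the data are all hypothetical.

```python
from statistics import mean


def signed_contributions(weights, row, background):
    """Signed contribution of each feature relative to the background mean.

    Assumes a linear model y = b + sum(w_i * x_i), for which the
    contribution of feature i is w_i * (x_i - mean of x_i over background).
    """
    contributions = {}
    for name, weight in weights.items():
        baseline = mean(bg[name] for bg in background)
        contributions[name] = weight * (row[name] - baseline)
    return contributions


# Hypothetical weights and data, for illustration only.
weights = {"age": 2.0, "dose": -1.0}
background = [{"age": 30, "dose": 1}, {"age": 50, "dose": 3}]
row = {"age": 45, "dose": 4}

contributions = signed_contributions(weights, row, background)
for name, value in contributions.items():
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {value:+.1f} ({direction} the prediction)")
```

A positive value corresponds to a feature pushing the prediction up for that row, a negative value to one pushing it down. This is the kind of signed quantity a bar or colour per cell can encode.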
get_activations_df
def get_activations_df(dataframe, graph):
    """
    Returns a pandas dataframe of activations of each interaction for a given graph.

    Arguments:
        dataframe {[pd.DataFrame]} -- The datapoints to find the activations of.
        graph {[feyn.Graph]} -- The graph with the interactions.

    Returns:
        [pd.DataFrame] -- Extended original dataframe of feature and activations values.
    """
Basic usage:
from feyn.__future__.contrib.inspection import get_activations_df
act_df = get_activations_df(dataframe=train, graph=graph)
This calculates the activation value of each interaction in the graph for every row of your dataset, and returns the original dataframe extended with those values.
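The idea of extending each row with one activation per interaction can be sketched without feyn. The toy graph below is a hypothetical stand-in (a list of name, function and source names), not the feyn.Graph structure, and activations_for_rows is an illustrative helper rather than the library's implementation.

```python
import math

# Hypothetical toy graph: each entry is (name, function, source names).
# Inputs have no function; their value is read straight from the row.
toy_graph = [
    ("x", None, []),
    ("y", None, []),
    ("add", lambda a, b: a + b, ["x", "y"]),
    ("tanh", math.tanh, ["add"]),
]


def activations_for_rows(rows, graph):
    """Return a copy of each row extended with one value per interaction."""
    extended = []
    for row in rows:
        values = dict(row)  # keep the original feature columns
        for name, func, sources in graph:
            if func is not None:
                # Evaluate the interaction on the values of its sources.
                values[name] = func(*(values[s] for s in sources))
        extended.append(values)
    return extended


rows = [{"x": 1.0, "y": -1.0}, {"x": 0.5, "y": 0.25}]
result = activations_for_rows(rows, toy_graph)
for r in result:
    print(r)
```

Each output row keeps its feature columns and gains one column per non-input interaction, which mirrors the "extended original dataframe" described in the docstring.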
plot_interaction
def plot_interaction(graph, interaction, dataframe, scaling=None):
    """
    Plots the activation values of datapoints of an interaction. If the interaction is a register then it plots the distribution of points.

    Arguments:
        graph {[feyn.Graph]} -- The graph to calculate activation values
        interaction {[feyn.Interaction]} -- The interaction to plot
        dataframe {[pd.DataFrame]} -- The datapoints to plot
        scaling {[list of strings]} -- The features that will display scaling on axis

    Returns:
        [plotly.graph_objs._figure.Figure] -- Either a scatter plot of datapoints or a histogram of the feature distribution.
    """
Basic usage:
from feyn.__future__.contrib.inspection import plot_interaction
# In this example, graph[2] is a hidden interaction: neither an input feature nor the output
int_num = 2
interaction = graph[int_num]
# graph[0] is an input to graph[2]
fig = plot_interaction(graph, interaction=interaction, dataframe=train, scaling=[graph[0].name])
This returns a plot of the interaction. If the interaction is a register, it displays a histogram of the values in dataframe. If the interaction is the output, it displays two histograms: one of the actual values and one of the predicted values. If the interaction is hidden, that is, between input and output, it displays a plot of the activations: with one input, this is a histogram with an activation plot; with two inputs, it is a scatter plot where the colour corresponds to the activation.
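The cases above can be summarised as a small decision function. This is only a sketch of the rules as described in the text: plot_kind and its role labels ('register', 'output', 'hidden') are hypothetical names, not part of feyn.

```python
def plot_kind(role, n_inputs=0):
    """Map an interaction's role to the kind of plot described above.

    role is one of 'register', 'output' or 'hidden' (hypothetical labels);
    n_inputs only matters for hidden interactions.
    """
    if role == "register":
        return "histogram of the dataframe values"
    if role == "output":
        return "two histograms: actual and predicted values"
    if role == "hidden":
        if n_inputs == 1:
            return "histogram with an activation plot"
        if n_inputs == 2:
            return "scatter plot coloured by activation"
        raise ValueError("expected a hidden interaction with 1 or 2 inputs")
    raise ValueError(f"unknown role: {role!r}")


for role, n in [("register", 0), ("output", 0), ("hidden", 1), ("hidden", 2)]:
    print(role, n, "->", plot_kind(role, n))
```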
plot_categories
def plot_categories(register):
    """Plots the weights of each category of a categorical register

    Arguments:
        register {feyn._Register} -- Categorical register

    Returns:
        [plotly.graph_objs._figure.Figure] -- Bar chart of weights
    """
Basic usage:
from feyn.__future__.contrib.inspection import plot_categories
register = graph[0]
plot_categories(register)
This plots the weight of each category of the categorical register as a bar chart.
feature_recurrence_qgraph
def feature_recurrence_qgraph(train_df,
                              target,
                              qlattice,
                              filters=None,
                              n_iter=10,
                              stypes: typing.Optional[str] = None,
                              n_fits=10,
                              n_features=1,
                              top_graphs=10,
                              threads=4,
                              get_qgtype='regressor'):
    """Uses the QLattice to extract simple models and
    check which features are the most recurring.

    Arguments:
        train_df {pd.DataFrame} -- training set
        target {str} -- target name
        qlattice {feyn.QLattice} -- the QLattice

    Keyword Arguments:
        test_df -- a test dataset to evaluate on
        filters -- list of filters to apply
        n_iter {int} -- number of full training iterations (default: {10})
        n_fits {int} -- number of updating loops (default: {10})
        n_features {int} -- max number of features, between 1 and 4 (default: {1})
        top_graphs {int} -- number of inspected graphs (default: {10})
        threads {int} -- number of threads (default: {4})
        get_qgtype {str} -- type of qgraph (default: {'regressor'})

    Returns:
        DataFrame that records the features in the top graphs from each iteration
    """
Basic usage:
import feyn
from feyn.__future__.contrib.inspection import feature_recurrence_qgraph
ql = feyn.QLattice()
# This is the name of the target variable
target = 'output'
res_df = feature_recurrence_qgraph(train_df, target, ql)
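To illustrate what "most recurring features" means, here is a minimal sketch that tallies how often each feature appears in the top graphs across iterations. It does not call feyn: the feature lists are made up, count_recurrence is a hypothetical helper, and the actual layout of the DataFrame returned by feature_recurrence_qgraph is not assumed.

```python
from collections import Counter

# Hypothetical features used by the top graphs of three iterations.
top_graph_features = [
    [["age", "dose"], ["age"], ["dose", "weight"]],  # iteration 1
    [["age"], ["age", "weight"]],                    # iteration 2
    [["dose"], ["age", "dose"]],                     # iteration 3
]


def count_recurrence(iterations):
    """Count in how many top graphs each feature appears."""
    counts = Counter()
    for graphs in iterations:
        for features in graphs:
            counts.update(set(features))  # count each feature once per graph
    return counts


recurrence = count_recurrence(top_graph_features)
for feature, count in recurrence.most_common():
    print(feature, count)
```

Features that keep reappearing in the best simple models across independent iterations are natural candidates for closer inspection.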