by: Chris Cave
(Feyn version 2.0.1 or newer)
The summary of a Model can be seen using the `plot` function on the Model. This provides a graph visualisation of the Model that colours the nodes of the graph and displays summary metrics. The metrics of the Model on the data passed are:
- For regressors:
  - R2 score,
  - Root mean squared error,
  - Mean absolute error.
- For classifiers:
  - Accuracy at 0.5 threshold,
  - Precision at 0.5 threshold,
  - Recall at 0.5 threshold.
Here is an example:
```python
import feyn
from sklearn.datasets import load_boston
import pandas as pd
from feyn.tools import split

# Download boston housing dataset
boston = load_boston()
df_boston = pd.DataFrame(boston.data, columns=boston.feature_names)
df_boston['PRICE'] = boston.target

# Train/test split
train, test = split(df_boston)

# Connect to QLattice
ql = feyn.connect_qlattice()
models = ql.auto_run(
    data=train,
    output_name='PRICE'
)

# Select the best Model
best = models[0]
best.plot(
    data=train,
    compare_data=test
)
```
Colours on a graph
The nodes in the graph returned from the `plot` function are coloured and have values above them. The values show which paths in the graph are important or unnecessary in the Model. If the value of a node has increased significantly (by about 0.05) compared to its inputs, then those inputs are important. Otherwise the node is likely to be unnecessary.
If there is redundancy, then run a simulation with a lower `max_complexity` and with a `criterion`. You will get a Model with fewer unnecessary paths without sacrificing much on performance. Repeating this process over many iterations enables you to decide on the Model with the correct balance of interpretability and performance for your dataset.
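As a sketch, a lower-complexity run could look like the following. The exact `max_complexity` value here is illustrative, and `'bic'` is one of the information criteria Feyn accepts (`'aic'` being the other); this assumes the `ql`, `train` and `test` variables from the example above and a live QLattice connection.

```python
# Re-run with a tighter complexity budget and an information criterion,
# which penalises complexity when ranking candidate Models.
models = ql.auto_run(
    data=train,
    output_name='PRICE',
    max_complexity=4,  # illustrative value, lower than the default
    criterion='bic'    # 'aic' is an alternative
)

# Inspect the simpler Model to see whether redundant paths have gone
best = models[0]
best.plot(data=train, compare_data=test)
```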
The values are the Pearson's correlation coefficient between the activation values of the node and the output variable. The colours correspond to the correlation coefficient: -1 is represented by red, 0 by white, and 1 by green.
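To make those node values concrete, here is a small sketch (plain NumPy, not part of Feyn) of how a Pearson correlation between a node's activations and the output variable can be computed:

```python
import numpy as np

def pearson(activations, output):
    """Pearson's correlation coefficient between a node's
    activation values and the output variable."""
    a = np.asarray(activations, dtype=float)
    y = np.asarray(output, dtype=float)
    a_c = a - a.mean()  # centre both series
    y_c = y - y.mean()
    return (a_c @ y_c) / np.sqrt((a_c @ a_c) * (y_c @ y_c))

# A node whose activations move with the output sits near 1 (green),
# one that moves against it near -1 (red), an uncorrelated one near 0 (white).
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # → 1.0
```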
The `plot` function also accepts some optional arguments:
- It can take a single dataframe to compare with, or a list of dataframes, in case you want to compare multiple datasets or splits with each other.
- It takes a correlation function amongst `['pearson', 'spearman', 'mutual_information']` to compute the correlations at each interaction in the Model.
- You can pass in custom labels for the header of the summary metrics. If your list of labels is shorter than the `compare_data` argument, numbered labels will be generated for the rest.
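Assuming the optional parameters follow the descriptions above (`compare_data` is named in the text; `labels` is an assumed parameter name, so check it against your Feyn version), comparing splits with custom headers could look like:

```python
# Compare the training set against the test split with custom headers.
# 'labels' is an assumed parameter name based on the description above.
best.plot(
    data=train,
    compare_data=[test],  # a list lets you compare several splits at once
    labels=['Training set', 'Test set']
)
```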
This function can also be found in `feyn.plots`:

```python
from feyn.plots import plot_model_summary

plot_model_summary(best, train)
```