Plotting the loss graph
by: Kevin Broløs
Feyn version: 2.1+
Last updated: 13/10/2021
In this example, we'll explore a few different ways we can get a traditional loss graph during training with feyn and the QLattice.
Training using the primitives
auto_run is great, but we can accomplish exactly the same behaviour using just the primitives, and doing so allows us to inject this plot into the flow quite easily. You can read more about how auto_run can be decomposed here. I'm going to assume you've read that and know how the standard fitting loop works.
In broad strokes, you:
- Sample models from the QLattice, and add them to your collection of models
- Fit all the models, which will rank them according to your criterion ('aic', 'bic' or None for loss)
- (Optional) Prune the worst performing models according to a decay function. This helps performance a lot by reducing the time spent fitting bad models.
- Update the QLattice with your ranked list of models
- Repeat the process until done.
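To make the flow concrete before we add any plotting, here is a bare-bones sketch of that loop. It uses the variable names we'll define in the Preparation section below, so treat it as a rough outline rather than something to run on its own:
# A bare-bones sketch of the decomposed loop (no plotting yet).
# ql, train, output_name, kind, loss_function, criterion, threads and
# n_epochs are all defined in the Preparation section below.
models = []
for epoch in range(1, n_epochs+1):
    # Sample new models from the QLattice and add them to the collection
    models += ql.sample_models(train, output_name, kind)

    # Fit the models; this also ranks them according to the criterion
    models = feyn.fit_models(models, train, threads=threads, loss_function=loss_function, criterion=criterion)

    # Prune the worst performing models
    models = feyn.prune_models(models)

    # Feed the ranked list back to the QLattice and repeat
    ql.update(models)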
Preparation
First, we'll go through a few steps to prepare ourselves for training
Loading dependencies and a dataset
import feyn
from sklearn.datasets import load_breast_cancer
import matplotlib.pyplot as plt
import numpy as np
data = load_breast_cancer(as_frame=True).frame
Let's define some useful variables about our problem type
# It's a classification problem, so let's use binary cross entropy
kind = 'classification'
loss_function = 'binary_cross_entropy'
# The output name in the dataframe is called 'target', as is standard for sklearn datasets.
output_name = 'target'
# How long to train, how many threads, ranking criterion
n_epochs = 20
threads = 4
criterion = 'bic'
# The all-important random_seed
random_seed = 42
Data validation and splitting
auto_run validates the data, so we might as well do the same.
We then split it into train and test sets. We'll also split the train set again into a validation set that we'll use later!
feyn.validate_data(data, kind, output_name)
train, test = feyn.tools.split(data, random_state=random_seed)
train, validation = feyn.tools.split(train, random_state=random_seed)
And connect to our QLattice
ql = feyn.connect_qlattice()
A simple approach
Defining a helper function for plotting the loss
We define a simple function, using matplotlib, that will display our loss values over time:
def plot_losses(loss_values, epoch, n_epochs):
    x0 = list(range(1, epoch+1))

    plt.figure(figsize=(5, 2))
    plt.plot(x0, loss_values)
    plt.title('Model loss')
    plt.show()
Training loop
This is the standard fitting loop, with the addition of recording the best loss at every epoch and plotting it.
I've inserted comments in the places where something special is happening:
ql.reset(random_seed)

# Keeping track of the loss values over time
train_losses = []

models = []
for epoch in range(1, n_epochs+1):
    models += ql.sample_models(train, output_name, kind)
    models = feyn.fit_models(models, train, threads=threads, loss_function=loss_function, criterion=criterion)
    models = feyn.prune_models(models)

    # Append the latest loss value of the top model and display the loss with our function
    train_losses.append(models[0].loss_value)
    plot_losses(train_losses, epoch, n_epochs)

    # Note: because we use IPython.display (update_display=True) in show_model, the order here is important.
    feyn.show_model(
        models[0],
        # Just a simple label. Auto_run is more sophisticated.
        label=f"Epoch {epoch}/{n_epochs}.",
        update_display=True,
    )
    ql.update(models)

models = feyn.get_diverse_models(models)

# Display the final model and the loss graph
plot_losses(train_losses, epoch, n_epochs)
models[0].show(update_display=True)
Adding the validation loss
Maybe you'd like to plot the training loss alongside the validation loss instead?
It should be easy to modify the code above to plot both. One caveat is that we need to know what loss is being used by the trainer.
We're using binary cross entropy, since it's a classification task.
Below, we have a modified plot_losses function, as well as an adjusted training loop that also computes and keeps track of the validation losses.
def plot_losses(train_losses, val_losses, epoch, n_epochs):
    x0 = list(range(1, epoch+1))

    plt.figure(figsize=(5, 2))
    plt.plot(x0, train_losses, label='Train loss')
    plt.plot(x0, val_losses, label='Validation loss')
    plt.title('Model loss')
    plt.legend()
    plt.show()
from feyn.losses import binary_cross_entropy

ql.reset(random_seed)

# Keeping track of the loss values over time
train_losses = []
val_losses = []

models = []
for epoch in range(1, n_epochs+1):
    models += ql.sample_models(train, output_name, kind)
    models = feyn.fit_models(models, train, threads=threads, loss_function=loss_function, criterion=criterion)
    models = feyn.prune_models(models)

    train_losses.append(models[0].loss_value)

    # Calculate the mean binary cross entropy of the validation dataset
    val_loss = np.mean(binary_cross_entropy(validation[output_name], models[0].predict(validation)))
    val_losses.append(val_loss)

    # Plot both!
    plot_losses(train_losses, val_losses, epoch, n_epochs)

    feyn.show_model(
        models[0],
        label=f"Epoch {epoch}/{n_epochs}.",
        update_display=True,
    )
    ql.update(models)

models = feyn.get_diverse_models(models)

# Display the final model and the loss graph
plot_losses(train_losses, val_losses, epoch, n_epochs)
models[0].show(update_display=True)
A zoomed graph
When training models with feyn, you'll notice that the improvements tend to come in "jumps" as it finds more useful models. This is because we're learning the structure of the models, not just fitting their parameters. When the next best structure is found, the loss may drop significantly.
Since this scaling can make the plot harder to read, you can make a function that gives a total view of the loss development as well as a window of the last few epochs.
The lookbehind for the window here is not set in stone, and you can define it to your own liking. This one adds the zoomed window once more than a fifth of the total epochs, or 50 epochs, have passed - whichever comes first. I find that a window of more than 50 epochs is not super useful in long training runs.
def plot_losses(loss_values, val_losses, epoch, n_epochs):
    lookbehind = min(max(int(n_epochs / 5), 1), 50)
    x0 = list(range(1, epoch+1))

    if epoch <= lookbehind:
        # This is the same code as before
        plt.figure(figsize=(5, 2))
        plt.plot(x0, loss_values, label="Train loss")
        plt.plot(x0, val_losses, label="Validation loss")
        plt.title('Model loss')
    else:
        # This uses subplots to also plot a window of the n most recent epochs
        _, axs = plt.subplots(1, 2, figsize=(10, 2))
        lb_len = len(loss_values[-lookbehind:])

        axs[0].plot(x0, loss_values, label="Train loss")
        axs[0].plot(x0, val_losses, label="Validation loss")
        axs[0].set_title('Model loss')

        x1 = list(range(max(epoch-lb_len, 1), epoch+1)[-lookbehind:])
        axs[1].plot(x1, loss_values[-lookbehind:], label="Train loss")
        axs[1].plot(x1, val_losses[-lookbehind:], label="Validation loss")
        axs[1].set_title(f'Model loss for last {lookbehind} epochs.')
        plt.tight_layout()

    plt.legend()
    plt.show()
Training again
This code is exactly the same as our previous case - the only change is in the plotting function above.
from feyn.losses import binary_cross_entropy

ql.reset(random_seed)

# Keeping track of the loss values over time
train_losses = []
val_losses = []

models = []
for epoch in range(1, n_epochs+1):
    models += ql.sample_models(train, output_name, kind)
    models = feyn.fit_models(models, train, threads=threads, loss_function=loss_function, criterion=criterion)
    models = feyn.prune_models(models)

    train_losses.append(models[0].loss_value)

    # Calculate the mean binary cross entropy of the validation dataset
    val_loss = np.mean(binary_cross_entropy(validation[output_name], models[0].predict(validation)))
    val_losses.append(val_loss)

    # Plot both!
    plot_losses(train_losses, val_losses, epoch, n_epochs)

    feyn.show_model(
        models[0],
        label=f"Epoch {epoch}/{n_epochs}.",
        update_display=True,
    )
    ql.update(models)

models = feyn.get_diverse_models(models)

# Display the final model and the loss graph
plot_losses(train_losses, val_losses, epoch, n_epochs)
models[0].show(update_display=True)
Evaluation and next steps
How you decide to use this is up to you. Now that you have access to this information, maybe a natural next step is to use the validation loss to do model selection.
Normally, you would also evaluate the model on the test (or holdout) set, but we'll save that for now.
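As a rough sketch of that idea, reusing the variables from the training loop above, you could pick the candidate with the lowest validation loss and only then look at the test set. The names val_scores, best, test_loss and test_accuracy below are just illustrative:
# A minimal sketch of validation-based model selection.
# models already holds the diverse candidates from the final
# feyn.get_diverse_models call above.

# Score each candidate by its mean validation loss
val_scores = [
    np.mean(binary_cross_entropy(validation[output_name], m.predict(validation)))
    for m in models
]
best = models[int(np.argmin(val_scores))]

# Only now touch the held-out test set
test_loss = np.mean(binary_cross_entropy(test[output_name], best.predict(test)))
test_accuracy = np.mean((best.predict(test) >= 0.5).astype(int) == test[output_name])
print(f"Test loss: {test_loss:.4f}, test accuracy: {test_accuracy:.3f}")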
Hopefully this gives you some inspiration for what you can achieve with the decomposed workflow!