# Fitting the QGraph

by: Kevin Broløs

## O' graph of graphs

Alright, you've got your data prepared, your `QLattice` set up, your `registers`.. er... Registered. What next?

We briefly touched upon the `QGraph` as a concept. It's time to go a little deeper. Essentially, the `QGraph` is the representation of the infinite number of `graphs` (or paths) conceivable through the `QLattice` from your `input registers` to your `output register`, considering all possible combinations of interactions.

It's easy to imagine how this combinatorially explodes as `interactions` happen between both the features and the transformations of features, so obviously it's not *really* the infinite set of graphs.
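To get a feel for that explosion, here is a toy back-of-envelope count in plain Python. The tree model and the feature/function counts are made up for illustration; this is not feyn's actual search space, just a sketch of how fast combinations grow:

```python
from math import comb

def catalan(n):
    # Number of distinct binary tree shapes with n internal nodes.
    return comb(2 * n, n) // (n + 1)

def graph_count(leaves, n_features, n_functions):
    # Rough count of distinct expression trees with a given number of leaves:
    # tree shapes x feature choices per leaf x function choices per interaction.
    shapes = catalan(leaves - 1)
    return shapes * n_features**leaves * n_functions**(leaves - 1)

# Even with 5 features and 8 functions, the count blows up quickly.
for leaves in (2, 3, 4, 5):
    print(leaves, graph_count(leaves, n_features=5, n_functions=8))
```

With just five leaves this toy model already exceeds a hundred million candidate graphs, which is why the search has to be guided rather than exhaustive.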

What you need to know, though, is what it represents and how you interact with it. So let's back up a bit.

## It's just graphs. Really.

Have you ever looked at a neural network and thought, "Huh, I wish this would just build itself. Oh, and also have all kinds of cool functions to choose from"?

Welcome to the `QGraph`. This is essentially a testbed for all those weird ideas you had when you did your data analysis and considered what to use and what not to use, except it does all that for you, and also comes with alternate suggestions. Initially, these suggestions will be pretty weird, but a few gems will emerge. As you update the `QLattice` while evaluating all these suggestions, the search space narrows and more of the suggestions become relevant.

This is done through what we refer to as the `update loop`.

## The update loop

While you can just extract a `QGraph`, keep tuning it and never call the `QLattice` again, that's not really taking advantage of the diversity of solutions available at your fingertips, and it might cause you to converge on some pretty bad decisions made early on, before the problem space was really well understood.

That's why we update the `QLattice`. Doing so allows the `QLattice` to home in on your problem space and keep suggesting things that are useful, rather than just random stuff.

An example update loop could look like this, but it can be as sophisticated as you want it to be (think cross validation, ensemble solutions; only your imagination is the limit):

```
no_loops = 3
for _ in range(no_loops):
    # Get a QGraph. This will be biased with learnings from previous `updates`.
    qgraph = qlattice.get_qgraph(data.columns, 'target')

    # Now fit the local QGraph with your local data.
    qgraph.fit(train, epochs=10)

    # Select the graph with the lowest training loss as the best solution.
    best_graph = qgraph.select(train)[0]

    # Feed the experience back to the QLattice so it gets smarter for subsequent calls.
    qlattice.update(best_graph)
```

## Add some thermal paste, it's about to get hot

I want to show you some of these `graphs`, but let's first get into the fit function and how it works.

```
qgraph.fit(train, epochs=10)
```

When you call the fit function, it trains all the current graphs inside the `QGraph` for the specified number of epochs before letting you select the final `graph` based on your favourite metric.

The average size of the `QGraph` is currently in the `thousands`, so you can think of this as fitting a thousand different models, verifying them against your result, and taking the best learnings with you for the next iteration, where you train a thousand brand new ones and compare them to your current best contenders.
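As a stand-alone illustration of that bookkeeping, here is a toy sketch in plain Python that keeps the lowest-loss contenders seen across several rounds of fresh candidates. The random "models" and all names here are made up; this is just the keep-the-best-so-far idea, not how feyn works internally:

```python
import heapq
import random

random.seed(0)

def best_contenders(rounds, candidates_per_round, keep=5):
    # Keep the lowest-loss candidates seen across all rounds.
    # Each "model" is just a (loss, name) pair in this toy version.
    contenders = []
    for r in range(rounds):
        fresh = [(random.random(), f"round{r}-graph{i}")
                 for i in range(candidates_per_round)]
        contenders = heapq.nsmallest(keep, contenders + fresh)
    return contenders

for loss, name in best_contenders(rounds=3, candidates_per_round=1000):
    print(f"{name}: {loss:.4f}")
```

Each round, a thousand fresh candidates compete against the survivors of all previous rounds, so the shortlist only ever improves.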

You can decide the loss function you use among the ones in `feyn.losses`, using the `loss_function` parameter.

```
from feyn import losses
qgraph.fit(train, loss_function=losses.mean_absolute_error)
```

### Threading the needle

Having many `graphs` means a lot of work to do. So if you have multiple cores in your CPU, you can take advantage of each of them by declaring the number of threads you want to make available in the `fit` function.

```
qgraph.fit(train, loss_function=losses.mean_absolute_error, threads=4)
```

You're still going to have to end up with a final best `graph` (or several) to use as your model, so let's talk about the selection process.

### Survival of the fittest

You can select `graphs` on your training set or, even better, on your validation set.

```
best_graph = qgraph.select(train)[0]
```

You can use any kind of metric you wish for selecting your `graphs`. The default is the `mean squared error`, calculated on the dataset you feed into the select function.

Here are a few examples:

```
best = qgraph.select(train, loss_function=losses.mean_absolute_error, n=1)[0]
top5 = qgraph.select(train, n=5)
```
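To make the validation-set idea concrete, here is a conceptual sketch in plain Python of what such a selection step does: rank candidates by their loss on held-out data and keep the top `n`. The `select` helper, the toy models and the `mse` function below are hypothetical stand-ins for illustration, not feyn's API:

```python
def select(models, data, loss_function, n=1):
    # Rank candidate models by their loss on the given dataset -- the same
    # idea as qgraph.select, sketched over plain Python callables.
    return sorted(models, key=lambda m: loss_function(m, data))[:n]

def mse(model, data):
    # Mean squared error of a model over (x, y) pairs.
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Toy models: predict y from x with a fixed slope.
models = [lambda x, a=a: a * x for a in (0.5, 1.0, 2.0)]
validation = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2)]  # held-out (x, y) pairs

best = select(models, validation, loss_function=mse, n=1)[0]
print(best(1.0))  # prediction of the best-fitting toy model at x=1.0
```

Selecting on held-out data like this helps you avoid keeping graphs that merely memorised the training set.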

## You promised me graphs

To get and render a selection of `graphs` from a `QGraph`, call the head function.

In a Jupyter or IPython environment, this will render the `graphs`.

```
qgraph.head(n=3)
```

You can also render each individual graph, either by selecting them using the above selection process or by accessing them through `qgraph.head()`.

## Plotting during fitting

By now you've probably also noticed that if you run in an IPython environment, `qgraph.fit()` will display the current best (lowest loss) graph while it's training.

You can change that behaviour using the show parameter: disable it entirely with `None`, display text printouts with `"text"`, or add in your own callback function to plot the metrics that matter the most to you.

```
# Examples
qgraph.fit(train, show=None, epochs=1)
qgraph.fit(train, show="text", epochs=1)

def plot_callback(graph, loss):
    print(f"Loss {loss}")

qgraph.fit(train, show=plot_callback, epochs=1)
```

```
Examined 1457 of 1458. Best loss so far: 0.016275
Loss 0.016112560918642264
```

Note: the text `Examined n of N.` indicates the number of `graphs` (n) examined out of their total number (N) in the `QGraph`.

Next, let's move on to the `graphs` themselves, and how to save them for later, get predictions, and evaluate them!