# Converting a graph to SymPy

by: Kevin Broløs

(Feyn version 1.4.6 or newer)

We use `SymPy` to convert our graphs to symbolic mathematical expressions. This means that you can convert our graphs to `SymPy` objects for further manipulation or processing, execute them as equations, print them in `LaTeX`, or just implement the graph in any environment following the equation.

The only limitation is that we don't currently support exporting categorical registers to an executable function, so for graphs with categories, you'll only have an expression for understanding purposes.

## Let's first generate a dataset and initialize a QLattice

We'll start by generating a dataset using `sklearn`.

```
from feyn import QLattice
from sklearn.datasets import make_classification
import pandas as pd

# Generate a dataset and put it into a dataframe with columns named 'a', 'b', 'c', ...
X, y = make_classification()
data = pd.DataFrame(X, columns=list('abcdefghijklmnopqrstuvwxyz'[:X.shape[1]]))
data['target'] = y

# Initialize a QLattice
qlattice = QLattice()
```

## Fit the QGraph real quick

We won't worry about train/test splits or exploring relations here, so we'll just fit the `QGraph` once and convert the top graph to a mathematical expression.

```
qgraph = qlattice.get_classifier(data, 'target', max_depth=1)
qgraph.fit(data)
# Take the top graph
graph = qgraph[0]
graph
```

Now that we have a graph, let's stop and reflect on what it represents.

This graph takes the inputs **f** and **m**, adds them together, and finally applies a sigmoid function on the result to force it between 0 and 1 for classification.

## Convert to SymPy

Let's see how the `SymPy` expression looks converted to `LaTeX`:

```
sympy_graph = graph.sympify(symbolic_lr=True)
sympy_graph.as_expr()
```

$\displaystyle \frac{1}{0.878142 e^{- 0.698784 f - 5.97639 m} + 1}$

We can very clearly see the sigmoid function represented here: $\displaystyle \sigma(x) = \frac{1}{1 + e^{-x}}$, where $x = 0.698784 f + 5.97639 m$. The exponential term further has a factor of $0.878142$. The numbers can be a little tricky, since they're automatically simplified by `SymPy`, but they're based on the **scale**, **weight** and **bias** of the **inputs** and **outputs**.
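To make the role of that factor concrete: a constant multiplying the exponential is equivalent to a bias added inside the sigmoid, since $c\,e^{-x} = e^{-(x - \ln c)}$. The following is a minimal sketch using only the Python standard library, with the coefficients copied from the expression above. The function names are made up for illustration and are not part of `Feyn` or `SymPy`.

```python
import math

def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def printed_form(f, m):
    """The classifier expression as printed above (hypothetical helper name)."""
    return 1.0 / (0.878142 * math.exp(-0.698784 * f - 5.97639 * m) + 1.0)

# Fold the factor into the exponent: c * e^(-x) == e^(-(x - ln c))
bias = -math.log(0.878142)

def sigmoid_form(f, m):
    """The same model written as a sigmoid of a linear combination plus a bias."""
    return sigmoid(0.698784 * f + 5.97639 * m + bias)

# Both forms agree up to floating point error for any inputs
print(printed_form(0.3, -0.1), sigmoid_form(0.3, -0.1))
```

Read this way, the expression is just a sigmoid applied to a weighted sum of **f** and **m** plus a single combined bias.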

Let's take a look at a simpler example so we can deduce this more clearly from the graph:

## Let's try it for a regressor

Even though it's a classification dataset, let's just try to regress on it by using a different target. Let's do feature **a**, since we know that one carries signal, due to the composition of `sklearn`'s **make_classification**.

```
qgraph = qlattice.get_regressor(data, 'a', max_depth=1)
qgraph.fit(data)
# Take the top graph
graph = qgraph[0]
graph
```

The regressor uses a linear cell as the output, so it'll look different from our classification result.

Let's see how the `SymPy` expression looks converted to `LaTeX`:

```
sympy_graph = graph.sympify(symbolic_lr=True)
sympy_graph.as_expr()
```

$\displaystyle 4.87371 e^{- 7.6721 \left(0.484426 - target\right)^{2} - 4.12931 \left(h + 0.343357\right)^{2}} - 0.0929986$

Again, you can see the features represented, in this case **target** and **h**. They go into a two-legged gaussian with the formulation $e^{-(x_0^2 + x_1^2)}$, where $x_0$ and $x_1$ are the inputs, each with a learned linear scaling of $scale \cdot weight \cdot x + bias$. The output is similarly scaled: the factor **4.87371** accounts for the weight and scale, and the bias is represented as the last term, **-0.0929986**.
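The layered structure can be checked by hand. The sketch below, which uses only the Python standard library, rewrites the printed expression as input scaling, then gaussian, then output scaling. The combined slopes are derived from the printed coefficients, not read from the graph, and the function names are hypothetical. Because each input is squared, the sign of its slope can't be recovered from the expression, so the positive root is an arbitrary choice.

```python
import math

def printed_form(target, h):
    """The regressor expression as printed above."""
    return 4.87371 * math.exp(
        -7.6721 * (0.484426 - target) ** 2 - 4.12931 * (h + 0.343357) ** 2
    ) - 0.0929986

def gaussian(x0, x1):
    """Two-legged gaussian: e^-(x0^2 + x1^2)."""
    return math.exp(-(x0 ** 2 + x1 ** 2))

# Combined input slopes (scale * weight), recovered from the printed coefficients
slope_target = math.sqrt(7.6721)
slope_h = math.sqrt(4.12931)

def layered_form(target, h):
    """Same model written as input scaling -> gaussian -> output scaling."""
    x0 = slope_target * target - slope_target * 0.484426  # slope * x + bias
    x1 = slope_h * h + slope_h * 0.343357                 # slope * x + bias
    return 4.87371 * gaussian(x0, x1) - 0.0929986

# Both forms agree up to floating point error
print(printed_form(1.0, 0.0), layered_form(1.0, 0.0))
```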

To get a little into where we get these weights and biases from, let's inspect the output node of the graph manually:

```
graph[-1].state._to_dict()
```

```
{'scale': 2.528051572591156,
'w': 1.9278497183870758,
'bias': -0.09299855246385323}
```

You'll see that the factor **4.87371** is approximately equal to $scale \cdot weight$ (the `scale` and `w` entries above), and the **bias** is the last term, **-0.0929986**.
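As a quick sanity check, the arithmetic can be reproduced from the state dictionary alone. This sketch copies the values printed above into a plain dictionary rather than reading them from a live graph. The product agrees with the printed factor to roughly five significant digits; the small remaining difference is presumably down to numeric precision in the conversion, which is an assumption and not something stated here.

```python
# Values copied from the output node state shown above
state = {
    'scale': 2.528051572591156,
    'w': 1.9278497183870758,
    'bias': -0.09299855246385323,
}

# The output factor in the expression is the product of scale and weight
output_factor = state['scale'] * state['w']

print(output_factor)   # approximately 4.8737, compare with the factor 4.87371
print(state['bias'])   # compare with the last term, -0.0929986
```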

The process is similar for the weights and biases on the inputs.

## What to do with the SymPy object

If you don't already know how to use it, check out the `SymPy` documentation. The object pretty-prints automatically in unicode terminals and `IPython` environments.

You can also use this for portability of final `Feyn` graphs, as you don't need the `Python` runtime to execute a simple mathematical equation. So you can take the output of these functions and port it to `R`, `STATA`, `JavaScript`, or even `Excel` if you want to.

## Significant digits

Keep in mind that more complex graphs may require more significant digits to stay accurate. You can adjust that by using the **signif** parameter on the function:

```
graph.sympify(signif=10, symbolic_lr=True)
```

$\displaystyle 4.873705048 e^{- 7.67210257 \left(0.4844257382 - target\right)^{2} - 4.129305744 \left(h + 0.3433571115\right)^{2}} - 0.09299855$

This returns 10 significant digits for each number in the expression, for instance.
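To get a feel for how much the rounding matters, you can evaluate the 6-digit and 10-digit versions of the regressor side by side. This is a minimal sketch using only the standard library, with the coefficients copied from the two expressions in this guide. The helper `regressor` and the evaluation point are made up for illustration.

```python
import math

def regressor(target, h, c):
    """Evaluate the regressor expression with coefficients c = (a, k0, t0, k1, h0, b)."""
    a, k0, t0, k1, h0, b = c
    return a * math.exp(-k0 * (t0 - target) ** 2 - k1 * (h + h0) ** 2) + b

# Coefficients copied from the 6-digit and 10-digit expressions
coeffs_6 = (4.87371, 7.6721, 0.484426, 4.12931, 0.343357, -0.0929986)
coeffs_10 = (4.873705048, 7.67210257, 0.4844257382,
             4.129305744, 0.3433571115, -0.09299855)

# Evaluate both at an arbitrary point and compare the results
y6 = regressor(1.0, 0.0, coeffs_6)
y10 = regressor(1.0, 0.0, coeffs_10)
print(y6, y10, abs(y6 - y10))
```

For a shallow graph like this one the two results differ only far past the decimal point, but rounding errors can compound in deeper graphs where coefficients are nested inside several functions.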