# Goodness of fit

by: Chris Cave

(Feyn version 1.5.4 or newer)

`Feyn` offers a range of tools to help you dissect your graph and its dynamics.

As sample data we use the Boston housing price dataset from sklearn, where the task is to predict median house prices for different areas around Boston. Below I import the data, prepare it, and find my graph of choice with my QLattice:

```
from sklearn.datasets import load_boston
import pandas as pd
from feyn import QLattice
from feyn.tools import split

# Download the Boston housing dataset
# (load_boston was removed in scikit-learn 1.2, so this needs an older release)
boston = load_boston()
df_boston = pd.DataFrame(boston.data, columns=boston.feature_names)
df_boston['PRICE'] = boston.target

# Train/test split
train, test = split(df_boston)

# Connect to the QLattice
ql = QLattice()
ql.reset()

# Get a regressor (max_depth=2, let's not overdo it)
qgraph = ql.get_regressor(train, 'PRICE', max_depth=2)
qgraph.fit(train)

# Select a graph from your fitted QGraph
best_graph = qgraph[0]
```

## Goodness of fit

Since I have a regressor, I would like to compare the true values of my target variable with the predicted values. The code below plots one point per sample: the true value of the target variable on the x-axis and the predicted value on the y-axis. If the prediction were perfect, all the points would lie on the dashed `y=x` line. I can use this to see whether I overestimate or underestimate in certain regions.

The line of best fit is an aid for seeing how close the points are to the line of equality.

```
best_graph.plot_goodness_of_fit(data=train)
```