Feyn


Classifiers and Regressors

by: Kevin Broløs
(Feyn version 1.4 or newer)

The QGraph

Before getting a classifier or regressor, it's important to understand how these are defined in the context of the QLattice.

Essentially, the QGraph is the representation of the infinite ordered list of graphs (or paths) conceivable through the QLattice from your input features to your output feature, considering all possible combinations of interactions.

It's easy to imagine how this explodes combinatorially, as interactions happen both between the features and between transformations of features. If you think about it for a bit, you realise that there are infinitely many such graphs. The QGraph is there to help you search through this infinite list and find the best graph it possibly can.

What you need to know, though, is what it represents and how you interact with it.

The QGraph is essentially a testbed for all those weird ideas you had when you did your data analysis and considered what to use and not to use, except it does all that for you, and also comes with alternate suggestions. Initially, these suggestions will be pretty weird, but a few gems will emerge. Through updating the QLattice as you evaluate all these suggestions, the search space narrows and more of these suggestions become relevant.

We showcase how to use this idea at length in our section on formulating hypotheses.

First things first: data

First we'll get a dataset. Let's just generate one using sklearn.

from sklearn.datasets import make_classification
import pandas as pd

from feyn.tools import split

# Generate a dataset and put it into a dataframe
X, y = make_classification()
data = pd.DataFrame(X, columns=[str(i) for i in range(X.shape[1])])
data['target'] = y

# Split into a train and test set
train, test = split(data, ratio=(0.75, 0.25))
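If you're curious what a ratio-based split like this does, here is a minimal sketch using pandas alone. This is an assumption about the behaviour (shuffle, then partition by ratio), not feyn's actual implementation, and the simple_split name is made up for illustration:

```python
import pandas as pd

def simple_split(df, ratio=(0.75, 0.25), seed=42):
    # Shuffle all rows, then cut them into two parts by the given ratio.
    # Hypothetical stand-in for feyn.tools.split -- illustration only.
    shuffled = df.sample(frac=1, random_state=seed)
    cut = int(len(shuffled) * ratio[0] / sum(ratio))
    return shuffled.iloc[:cut], shuffled.iloc[cut:]

demo = pd.DataFrame({"x": range(100), "target": [0, 1] * 50})
train_demo, test_demo = simple_split(demo)
print(len(train_demo), len(test_demo))  # 75 25
```

The important property is that the two parts are disjoint, so the test set contains rows the model never saw during fitting.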

Get a QGraph instance

Now that we have this dataset, we can let the QLattice know that we'd like a classifier using those inputs and a given output. The exact same approach applies for getting a regressor; only the method you call changes.

The QLattice is a generator of graphs from input to output, and a QGraph is an unbounded list of graphs that have been generated from it. We only need to extract a QGraph once, declaring which features we want to use and which column should be the output variable.

The following example uses all the columns as input and the target column as the output.

We're assuming that you've gone through and set up your QLattice with a configuration file.

from feyn import QLattice
qlattice = QLattice()

# This will extract a QGraph containing classifiers
qgraph = qlattice.get_classifier(data.columns, 'target')

Or for regression:

# This will also work, but will treat it as a regression problem.
qgraph = qlattice.get_regressor(data.columns, 'target')

Notice that we haven't passed in any of the data yet. That's because the QLattice doesn't work with data; it works with concepts.

Fitting the QGraph

Fitting a QGraph is as simple as calling qgraph.fit(data), but we often want to do this multiple times. The reason is that every time we call .fit, the QGraph discards the worst graphs and gives you a new evolution of graphs based on what the QLattice has learnt from you. This means you should also update the QLattice with your best graphs as you go along. Doing so allows the QLattice to home in on your problem space and keep suggesting things that are useful, rather than just random stuff.

We refer to this process as the update loop.

An example update loop could look like this, but it can be as sophisticated as you want it to be (think cross validation, ensemble solutions; only your imagination is the limit).

# Let's go back to our classifier for this example
qgraph = qlattice.get_classifier(data.columns, 'target')

n_loops = 10
for _ in range(n_loops):
    # Fit the QGraph with your local data
    # Note: This automatically fetches a new evolution of graphs, while keeping the best ones you already have!
    qgraph.fit(train)

    # The top graphs that have evolved independently of each other in the QLattice
    best_graphs = qgraph.best()

    # Feed these graphs back to the QLattice. The next fit will explore graphs more similar to them
    qlattice.update(best_graphs)

Inspecting the QGraph

To get and render a selection of graphs from a QGraph, call the head function.

In a Jupyter or IPython environment, this will render the graphs.

qgraph.head(n=3)


You can also render each individual graph by indexing the QGraph.

qgraph[3]


Plotting during fitting

By now you've probably also noticed that if you run in an IPython environment, QGraph.fit() will display the current best (lowest loss) graph while it's training.

You can change that behaviour using the show parameter: disable it entirely with None, display text printouts with 'text', or pass in your own callback function to plot the metrics that matter most to you.

# Examples
qgraph.fit(train, show=None)
qgraph.fit(train, show="text")

def plot_callback(graph, loss):
    print(f"Loss {loss}")

qgraph.fit(train, show=plot_callback)
Loss 0.016112560918642264
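A callback is just a function taking (graph, loss), following the plot_callback signature above. Since qgraph.fit itself needs a live QLattice connection, here is only the callback, a sketch that records the loss curve across calls so you can plot it afterwards:

```python
# Sketch: a show callback that keeps every reported loss.
# The (graph, loss) signature matches the plot_callback example above.
losses = []

def tracking_callback(graph, loss):
    losses.append(loss)
    print(f"Best loss so far: {min(losses):.6f}")

# Usage (requires a live QLattice):
# qgraph.fit(train, show=tracking_callback)
```

After fitting, the losses list holds the full history, which you could hand to any plotting library of your choice.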
Copyright © 2021 Abzu.ai
Feyn®, QGraph®, and the QLattice® are registered trademarks of Abzu®