Using the primitives
by: Kevin Broløs & Chris Cave
(Feyn version 3.0 or newer)
Feyn consists of primitive operations that you can compose in a multitude of ways to create your own simulations. Here we show how all the primitive operations fit together to create a QLattice simulation similar to the auto_run function. The auto_run function serves most needs, but the primitives are available for those who want to customize a simulation.
Example
To give an example, we first create a synthetic dataset using make_classification, and validate that it's fit for use with feyn.
import feyn
from feyn.datasets import make_classification
train, test = make_classification()
output_name = 'y'
# Optional, but helpful. Works best on the full dataset, not just the training split.
feyn.validate_data(data=train, kind='classification', output_name=output_name, stypes={})
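The stypes argument declares semantic types for inputs. As a hedged illustration (the column name 'species' is hypothetical and not part of the dataset above), a categorical input is declared by mapping its name to 'c':

```python
# Hypothetical stypes mapping: treat the column 'species' as categorical.
# Columns left out of the mapping are treated as numerical.
stypes = {'species': 'c'}
print(stypes)
```

You would then pass this dict as the stypes argument to feyn.validate_data, and reuse it in the later steps that accept stypes.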
Composing primitive operations
Below is an example of the basic composition of these primitive operations; you can modify any part of it to suit your needs.
# Instantiate a QLattice
ql = feyn.QLattice()
# Initialize an empty list of models
models = []
# Define how many epochs to run the simulation for
n_epochs = 10
# Compute prior probability of inputs based on mutual information
priors = feyn.tools.estimate_priors(train, output_name)
# Update the QLattice with priors
ql.update_priors(priors)
for epoch in range(n_epochs):
# Sample models from the QLattice, and add them to the list
models += ql.sample_models(train.columns, output_name, 'classification')
# Fit the list of models. Returns a list of models sorted by loss or criterion.
models = feyn.fit_models(models, train, 'binary_cross_entropy')
# Remove redundant and poorly performing models from the list
models = feyn.prune_models(models)
# Display the best model in the current epoch
feyn.show_model(models[0], label=f"Epoch: {epoch}", update_display=True)
# Update QLattice with the fitted list of models (sorted by loss)
ql.update(models)
# Find the 10 best and sufficiently diverse models
best_models = feyn.get_diverse_models(models, n=10)
best_models[0].plot(train, test)
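Conceptually, prune_models and get_diverse_models keep well-performing models while dropping redundant ones. A simplified, pure-Python sketch of that kind of selection (not feyn's actual implementation; the (signature, loss) pairs are made-up stand-ins for fitted models):

```python
def select_diverse(candidates, n=10):
    """Keep up to n best-scoring candidates, skipping exact structural duplicates.

    candidates: list of (signature, loss) tuples, where `signature` is a
    hypothetical stand-in for a model's structure.
    """
    chosen = []
    for signature, loss in sorted(candidates, key=lambda c: c[1]):
        # Skip candidates whose structure we already kept
        if all(signature != s for s, _ in chosen):
            chosen.append((signature, loss))
        if len(chosen) == n:
            break
    return chosen

candidates = [("x0*x1", 0.31), ("x0+x1", 0.28), ("x0*x1", 0.30), ("x0", 0.45)]
print(select_diverse(candidates, n=2))  # → [('x0+x1', 0.28), ('x0*x1', 0.30)]
```

The real functions are considerably smarter about what counts as redundant, but the shape of the operation is the same: sort by performance, filter, and cap the list.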
All operations are performed locally and no external communication takes place.
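The estimate_priors step above weights inputs by their mutual information with the output. As a rough, self-contained illustration of the underlying quantity (not feyn's actual algorithm), discrete mutual information can be estimated from joint counts:

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    # Discrete MI estimate from joint frequencies:
    # sum over (x, y) of p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# A perfectly informative input yields MI equal to the output entropy (1 bit here).
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # → 1.0
```

Inputs with higher mutual information get a higher prior, making the QLattice more likely to sample models that use them.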
The label parameter in feyn.show_model takes any string; the label in auto_run records more information. You can read about that here.
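For instance, a hand-rolled label could mimic that by including the epoch count and elapsed time (progress_label is a hypothetical helper, not part of feyn):

```python
def progress_label(epoch, n_epochs, elapsed_seconds):
    # Hypothetical helper: build a display label carrying the current
    # epoch and the elapsed wall-clock time, similar in spirit to the
    # richer label auto_run shows.
    return f"Epoch no. {epoch}/{n_epochs} - Elapsed: {elapsed_seconds:.0f}s"

print(progress_label(3, 10, 42.0))  # → Epoch no. 3/10 - Elapsed: 42s
```

Any such string can be passed as the label argument to feyn.show_model.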
In the next sections, we'll go through each of the steps that make up this simulation in depth.
Expanding auto_run
The above is an approximation of the workflow, but of course auto_run has more sophisticated handling for default values and edge cases. Most of that is not useful in user workflows.
However, if you are in a Jupyter notebook context, you can use an experimental feature to spawn a new code cell containing all the internals of auto_run, which you can then customize on your own. It takes all the same parameters that go into auto_run, so the code is executable from the get-go:
ql.expand_auto_run(train, output_name, kind='classification', n_epochs=n_epochs, stypes={})
Example output cell, heavily abbreviated to show its likeness to the code above:
# Dependencies
import feyn
# Parameters
ql = feyn.QLattice(random_seed=42)
data = train
output_name = "y"
kind = "classification"
n_epochs = n_epochs
stypes = {}
# Abbreviation.
# (...)
# auto_run code expansion:
feyn.validate_data(data, kind, output_name, stypes)
# (...)
models = []
priors = feyn.tools.estimate_priors(data, output_name)
ql.update_priors(priors)
for epoch in range(1, n_epochs + 1):
new_sample = ql.sample_models(data, output_name, kind, stypes) # (...)
models += new_sample
# (...)
models = feyn.fit_models(models, data=data) # (...)
models = feyn.prune_models(models)
# (...)
if len(models) > 0:
feyn.show_model(models[0], feyn.tools.get_progress_label(epoch, n_epochs), update_display=True) # (...)
ql.update(models)
best = feyn.get_diverse_models(models)
# (...)