Estimating priors · Feyn Documentation

by: Kevin Broløs and Emil Larsen
(Feyn version 3.0 or newer)

This guide shows how to estimate and update the priors before sampling models.

When using Auto Run, the default behaviour is to estimate the priors by using the function feyn.tools.estimate_priors.

This provides an efficient means of initial feature selection, and will increase predictive performance and reduce time to convergence of the QLattice in a majority of cases. It works particularly well for wide data sets with many inputs.

Example

Here's an example of how to use estimate_priors before manually sampling models from a QLattice to get a similar effect:

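import feyn
from feyn.datasets import make_classification

ql = feyn.QLattice()

# Generate a synthetic classification data set, split into train and test
train, test = make_classification(random_state=42)
output_name = "y"

# Estimate the priors on the train set only, then pass them to the QLattice
priors = feyn.tools.estimate_priors(train, output_name, floor=0.1)
ql.update_priors(priors)

# Sample models; the updated priors influence which inputs are drawn
new_sample = ql.sample_models(
    train,
    output_name,
    'classification'
)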

Note that while this example stops at sampling, you only need to estimate the priors once if you're running a sample-fit-update loop as described in Using the primitives.

If we print out the value of the priors variable, we get the following map of input names with their relative weights in the range [0, 1]:

{
 'x0': 0.99,
 'x1': 0.97,
 'x2': 0.98,
 'x3': 0.88,
 'x4': 0.83,
 'x5': 0.94,
 'x6': 0.96,
 'x7': 0.81,
 'x8': 0.84,
 'x9': 0.87,
 'x10': 0.89,
 'x11': 1.0,
 'x12': 0.86,
 'x13': 0.95,
 'x14': 0.85,
 'x15': 0.9299999999999999,
 'x16': 0.92,
 'x17': 0.8200000000000001,
 'x18': 0.91,
 'x19': 0.9
}
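
To see which inputs the estimate favours, the map can be ranked by weight. The following is a minimal sketch in plain Python that uses a hand-copied subset of the values above, rounded to two decimals. The `top_inputs` helper is hypothetical and not part of Feyn.

```python
# A subset of the priors printed above, copied by hand for illustration.
priors = {"x0": 0.99, "x1": 0.97, "x2": 0.98, "x7": 0.81, "x11": 1.0, "x17": 0.82}


def top_inputs(priors, n=3):
    """Return the n input names with the highest prior weight (hypothetical helper)."""
    ranked = sorted(priors.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


print(top_inputs(priors))  # ['x11', 'x0', 'x2']
```

Because the weights are relative, the ranking is usually more informative than the absolute values.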

Higher values increase the initial likelihood of sampling the corresponding input from the QLattice. During training of the QLattice, the probability of sampling the different input variables will change as usual.

You don't have to use this particular prior estimation function. As shown in Updating priors, we can supply any priors we like - or none at all.
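
As a rough illustration of supplying your own priors, the sketch below builds a map by hand: every input starts at the same weight, and a few inputs are boosted. The `uniform_priors` helper, the weight of 0.5, and the choice of boosted inputs are all assumptions made for the example. Only the final, commented-out `ql.update_priors` call comes from the example above.

```python
def uniform_priors(input_names, weight=1.0):
    """Give every input the same prior weight (hypothetical helper, not part of Feyn)."""
    return {name: weight for name in input_names}


# Input names matching the example data set above (x0 ... x19).
input_names = [f"x{i}" for i in range(20)]

# Start from a uniform weight, then boost inputs assumed to matter.
priors = uniform_priors(input_names, weight=0.5)
priors.update({"x3": 1.0, "x7": 1.0})

# With a QLattice `ql` created as in the example above:
# ql.update_priors(priors)
```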

Parameters of `feyn.tools.estimate_priors`

data

The data the prior probabilities will be computed on.

Note: Make sure to only compute the priors based on the train set and not the entire data set. Otherwise information will leak from the test set and will bias the test error.

output_name

The name of the output (target) variable in the data set.

floor

Default: 0.1.

The minimum value permitted for any prior computed. If the computed prior of an input is below the floor value, it will be clamped to the floor value.

Note: If you set the floor to 0, inputs whose prior is 0 will not appear in models sampled by the QLattice during training once the priors have been applied (see Updating priors).
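
The clamping behaviour described for floor can be sketched in a few lines of plain Python. This is an illustration of the idea only, not Feyn's implementation. The `apply_floor` helper and the raw values are made up for the example.

```python
def apply_floor(raw_priors, floor=0.1):
    """Raise every prior below `floor` up to `floor` (hypothetical helper)."""
    return {name: max(value, floor) for name, value in raw_priors.items()}


# Made-up raw priors, before any floor is applied.
raw = {"x0": 0.99, "x1": 0.04, "x2": 0.0}

print(apply_floor(raw))           # {'x0': 0.99, 'x1': 0.1, 'x2': 0.1}
print(apply_floor(raw, floor=0))  # {'x0': 0.99, 'x1': 0.04, 'x2': 0.0}
```

With the default floor, x2 keeps a small chance of being sampled. With a floor of 0, its prior stays at 0, which is the case the note above warns about.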
