Categorical features

by: Chris Cave and Emil Larsen
(Feyn version 3.0 or newer)

A feature is categorical if there is no clear ordering in the values the feature can take. The values a categorical feature can take are called categories. Below is an example dataset containing categorical features and their categories:

Country   Favourite colour   Gender   Smoker/non-smoker
Denmark   Red                Male     0
Spain     Yellow             Female   1
UK        Blue               Male     1
Brazil    Green              Male     0
USA       Yellow             Female   1
Italy     Red                Female   0

None of the features in the example above has an obvious ordering.
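A rough way to spot candidate categorical features is to look for string-valued columns. This is a sketch using plain pandas, not part of Feyn itself:

```python
import pandas as pd

data = pd.DataFrame({
    "Country": ["Denmark", "Spain", "UK", "Brazil", "USA", "Italy"],
    "Favourite colour": ["Red", "Yellow", "Blue", "Green", "Yellow", "Red"],
    "Gender": ["Male", "Female", "Male", "Male", "Female", "Female"],
    "Smoker": [0, 1, 1, 0, 1, 0],
})

# String-valued columns carry no numerical ordering and are
# candidates for the categorical semantic type.
categorical_cols = [col for col in data.columns if data[col].dtype == object]
print(categorical_cols)  # ['Country', 'Favourite colour', 'Gender']
```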

How the QLattice treats categorical features

Before we pass the dataset above through the QLattice, we need to specify the semantic types of the categorical features.

import feyn
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "Country": ["Denmark", "Spain", "UK", "Brazil", "USA", "Italy"],
    "Favourite colour": ["Red", "Yellow", "Blue", "Green", "Yellow", "Red"],
    "Gender": ["Male", "Female", "Male", "Male", "Female", "Female"],
    "Smoker": [0,1,1,0,1,0]
})

stypes = {
    'Country': 'c',
    'Favourite colour': 'c',
    'Gender': 'c',
}

We can then pass the data and semantic types into the QLattice.

ql = feyn.QLattice(random_seed=42)
models = ql.auto_run(
    data=data,
    output_name="Smoker",
    stypes=stypes,
    n_epochs=1,
    max_complexity=2
)
model = models[0]

Here is one model from the output of auto_run.

How does the QLattice interpret the categories of a feature as numbers, so that they can be inputs to a mathematical function?

Each category is associated with a numerical value, which we call its weight. The weights are learned while the model is being fitted to the data.

We can inspect the weight associated with each category through the input node's params:

input_node = model[2]
print(f"category weights: {input_node.params['categories']}")
print(f"bias: {input_node.params['bias']}")
category weights: [
    ('Yellow', 0.26123558974851063),
    ('Blue', 0.26123554327444204),
    ('Red', -0.25637270012528196),
    ('Green', -0.25637278717475576)
]
bias: 0.12080737359066837

When we predict on a sample containing one of the categories listed above, the category is first converted to the numerical value category_weight + bias.

This value gets passed into the functions inside the model and behaves like a normal numerical value.
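The conversion can be sketched in plain Python using the weights printed above (the numbers are the fitted values from this particular run; Feyn performs this step internally):

```python
# Fitted weights and bias copied from the output above (run-specific values)
category_weights = {
    "Yellow": 0.26123558974851063,
    "Blue": 0.26123554327444204,
    "Red": -0.25637270012528196,
    "Green": -0.25637278717475576,
}
bias = 0.12080737359066837

def encode(category: str) -> float:
    """Convert a category to the numerical value the model operates on."""
    return category_weights[category] + bias

# The encoded value is what flows into the model's mathematical functions.
print(encode("Red"))
```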

If any NaN values are present in a categorical feature, the NaN values will be interpreted as a category of their own.
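As an illustration of this bookkeeping (a sketch using pandas; Feyn's internals may differ), NaN simply becomes one more distinct category:

```python
import numpy as np
import pandas as pd

colour = pd.Series(["Red", np.nan, "Blue", np.nan, "Red"])

# Counting distinct values with NaN kept: NaN acts as its own category,
# so this column effectively has three categories, not two.
n_categories = colour.nunique(dropna=False)
print(n_categories)  # 3
```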

Copyright © 2023 Abzu.ai
Feyn®, QGraph®, and the QLattice® are registered trademarks of Abzu®