Feyn Documentation

Wine Quality

by: Meera Machado & Chris Cave

Feyn version: 2.1.+

Last updated: 23/09/2021

import pandas as pd

import feyn

Importing the dataset

We analyze the Wine Quality dataset from the UCI Machine Learning Repository and try to predict each wine's alcohol level from the other features.

data = pd.read_csv('../data/wine_quality.csv')
data.head()
   fixed acidity  volatile acidity  citric acid  residual sugar  chlorides  free sulfur dioxide  total sulfur dioxide  density    pH  sulphates  alcohol  quality  color
0            7.7              0.29         0.29             4.8      0.060                 27.0                 156.0  0.99572  3.49       0.59     10.3        6  white
1            6.2              0.47         0.19             8.3      0.029                 24.0                 142.0  0.99200  3.22       0.45     12.3        6  white
2           10.3              0.27         0.24             2.1      0.072                 15.0                  33.0  0.99560  3.22       0.66     12.8        6    red
3            6.3              0.37         0.28             6.3      0.034                 45.0                 152.0  0.99210  3.29       0.46     11.6        7  white
4            8.0              0.13         0.25             1.1      0.033                 15.0                  86.0  0.99044  2.98       0.39     11.2        8  white

Training session

random_seed = 42
# Train/test split with a 2:1 ratio
train, test = feyn.tools.split(data, ratio=[2, 1], random_state=random_seed)
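
To make the effect of ratio=[2, 1] and random_state concrete, here is a minimal standard-library sketch of a seeded, ratio-based split. It is an illustration only and not how feyn.tools.split is implemented; the helper name split_rows and the toy rows are hypothetical.

```python
import random

def split_rows(rows, ratio=(2, 1), random_state=None):
    """Shuffle rows and cut them into consecutive parts sized by `ratio`."""
    rng = random.Random(random_state)  # seeded generator for reproducibility
    shuffled = list(rows)
    rng.shuffle(shuffled)

    total = sum(ratio)
    parts, start, cumulative = [], 0, 0
    for weight in ratio:
        cumulative += weight
        # Cut at the cumulative share so the part sizes always sum to len(rows)
        stop = round(len(shuffled) * cumulative / total)
        parts.append(shuffled[start:stop])
        start = stop
    return parts

# Toy data standing in for the rows of the wine table (hypothetical)
rows = list(range(9))
train_rows, test_rows = split_rows(rows, ratio=(2, 1), random_state=42)
```

With a 2:1 ratio, two thirds of the rows end up in the first part and one third in the second. Reusing the same random_state reproduces the same split, which is why the tutorial fixes random_seed once and passes it along.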

Connecting to QLattice

# Connect to QLattice
ql = feyn.connect_qlattice()

# Reset and set a seed
ql.reset(random_seed=random_seed)

Sample and fit models using the primitive operations

Here we break the auto_run method down into its primitive operations: sampling, fitting, pruning, and updating the QLattice. Running these steps by hand allows for a more customizable workflow.

# Setting semantic types
stypes = {'color': 'c'}

# Set number of epochs
n_epochs = 20

# Initialize the list of models
models = []

# Sample and fit
for epoch in range(n_epochs):
    
    # Sample models (no data here yet)
    models += ql.sample_models(
        input_names=train.columns, 
        output_name='alcohol', 
        kind='regression', 
        stypes=stypes,
        max_complexity=10
    )
    
    # Fit the models with train data
    models = feyn.fit_models(models, train, loss_function='squared_error')
    
    # Remove redundant and worst performing models
    models = feyn.prune_models(models)
    
    # Display best model of each epoch
    feyn.show_model(models[0], label=f"Epoch: {epoch}", update_display=True)
    
    # Update QLattice with the models sorted by loss
    ql.update(models)
[Model diagram at epoch 19, loss 2.49E-01: alcohol predicted from color, density, fixed acidity, residual sugar, and pH]

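The stypes dictionary in the loop above marks color as categorical ('c') by hand. For wider tables it can be convenient to derive that dictionary from the data. The sketch below is one way to do so in plain Python; the helper infer_stypes is hypothetical and not part of feyn, and it assumes that every non-numeric column should be treated as categorical.

```python
import numbers

def infer_stypes(columns):
    """Return {name: 'c'} for every column that holds non-numeric values.

    `columns` maps a column name to the list of its values.
    """
    stypes = {}
    for name, values in columns.items():
        # Booleans count as non-numeric here, although bool subclasses int
        is_numeric = all(
            isinstance(v, numbers.Number) and not isinstance(v, bool)
            for v in values
        )
        if not is_numeric:
            stypes[name] = 'c'
    return stypes

# Three columns taken from the first rows of data.head()
sample = {
    'density': [0.99572, 0.99200, 0.99560],
    'quality': [6, 6, 6],
    'color': ['white', 'white', 'red'],
}
inferred = infer_stypes(sample)
```

For a pandas DataFrame, the same helper could be fed with data.to_dict('list'), which returns a mapping from column name to a list of values. Columns with missing values or mixed types would need extra handling that this sketch leaves out.
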
# Find the 10 best diverse models
best_models = feyn.get_diverse_models(models, n=10)

best_model = best_models[0]

Model inspection

Here we evaluate model performance with feyn.Model.plot, which summarizes the model and its metrics on the training and test sets.

# Summary plot of the best model
best_model.plot(train, test)
[Summary plot of the best model: alcohol predicted from color, density, fixed acidity, pH, and residual sugar]

Metric  Training  Test
R2      0.821     0.825
RMSE    0.502     0.503
MAE     0.377     0.376
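
For reference, the three metrics reported in the summary (R2, RMSE, and MAE) can be computed from actual and predicted values with a few lines of standard-library Python. This is a sketch using the usual textbook definitions; it is assumed, not confirmed, that the summary plot uses exactly these. The function name regression_metrics and the predicted values are made up for illustration.

```python
import math

def regression_metrics(actual, predicted):
    """Compute R2, RMSE, and MAE from two equally long sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        'R2': 1 - ss_res / ss_tot,
        'RMSE': math.sqrt(ss_res / n),
        'MAE': sum(abs(e) for e in errors) / n,
    }

actual = [10.3, 12.3, 12.8, 11.6, 11.2]     # alcohol column from data.head()
predicted = [10.5, 12.0, 12.5, 11.5, 11.0]  # made-up predictions, illustration only
metrics = regression_metrics(actual, predicted)
```

In the tutorial's setting, actual would be test['alcohol'] and predicted the output of best_model.predict(test). Training and test values that are close together, as in the table above, suggest the model is not overfitting.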


Copyright © 2024 Abzu.ai - Feyn license: CC BY-NC-ND 4.0
Feyn®, QGraph®, and the QLattice® are registered trademarks of Abzu®