Wine Quality

Feyn version: 2.1.+

Last updated: 23/09/2021

import pandas as pd

import feyn

Importing the dataset

We will run an analysis on the Wine Quality dataset from the UCI Machine Learning Repository. The goal is to predict each wine's alcohol level from the other features.

data = pd.read_csv('../data/wine_quality.csv')
data.head()

	fixed acidity	volatile acidity	citric acid	residual sugar	chlorides	free sulfur dioxide	total sulfur dioxide	density	pH	sulphates	alcohol	quality	color
0	7.7	0.29	0.29	4.8	0.060	27.0	156.0	0.99572	3.49	0.59	10.3	6	white
1	6.2	0.47	0.19	8.3	0.029	24.0	142.0	0.99200	3.22	0.45	12.3	6	white
2	10.3	0.27	0.24	2.1	0.072	15.0	33.0	0.99560	3.22	0.66	12.8	6	red
3	6.3	0.37	0.28	6.3	0.034	45.0	152.0	0.99210	3.29	0.46	11.6	7	white
4	8.0	0.13	0.25	1.1	0.033	15.0	86.0	0.99044	2.98	0.39	11.2	8	white

Training session

random_seed = 42

# Train/test split (2:1)
train, test = feyn.tools.split(data, ratio=[2, 1], random_state=random_seed)
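
To make the ratio=[2, 1] argument concrete, here is a minimal standard-library sketch of a seeded 2:1 shuffle split. It only illustrates the idea and is not Feyn's implementation; split_rows is a hypothetical helper name, and the integer rows are stand-ins for real data.

```python
import random


def split_rows(rows, ratio=(2, 1), seed=42):
    """Shuffle rows with a fixed seed and cut them into parts proportional to ratio.

    Hypothetical helper for illustration; not part of Feyn.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)

    total = sum(ratio)
    parts = []
    start = 0
    cumulative = 0
    for weight in ratio[:-1]:
        cumulative += weight
        end = round(len(shuffled) * cumulative / total)
        parts.append(shuffled[start:end])
        start = end
    parts.append(shuffled[start:])  # the last part takes whatever remains
    return parts


rows = list(range(9))  # stand-in for nine data rows
train_rows, test_rows = split_rows(rows, ratio=(2, 1), seed=42)
print(len(train_rows), len(test_rows))  # 6 3
```

Reusing the same seed reproduces the same partition, which is why the notebook passes random_seed to both the split and the QLattice reset.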

Connecting to QLattice

# Connect to QLattice
ql = feyn.connect_qlattice()

# Reset and set a seed
ql.reset(random_seed=random_seed)

Sample and fit models using the primitive operations

Here the auto_run method is broken down into its primitive operations: sampling models, fitting them, pruning them, and updating the QLattice. Running these steps yourself allows for a more customizable workflow.

# Set semantic types: 'c' marks 'color' as a categorical input
stypes = {'color': 'c'}

# Set number of epochs
n_epochs = 20

# Initialize the list of models
models = []

# Sample and fit
for epoch in range(n_epochs):
    
    # Sample new models from the QLattice (no data is used at this step)
    models += ql.sample_models(
        input_names=train.columns, 
        output_name='alcohol', 
        kind='regression', 
        stypes=stypes,
        max_complexity=10
    )
    
    # Fit the models with train data
    models = feyn.fit_models(models, train, loss_function='squared_error')
    
    # Remove redundant and worst performing models
    models = feyn.prune_models(models)
    
    # Display best model of each epoch
    feyn.show_model(models[0], label=f"Epoch: {epoch}", update_display=True)
    
    # Update QLattice with the models sorted by loss
    ql.update(models)

# Find the 10 best diverse models
best_models = feyn.get_diverse_models(models, n=10)

best_model = best_models[0]
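
The loop above fits with loss_function='squared_error'. As a reminder of what that quantity measures, here is a minimal plain-Python sketch. The helper name mean_squared_error and the predictions are made up for illustration; the targets are the first three alcohol values from the data preview. This is not Feyn's internal code.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


y_true = [10.3, 12.3, 12.8]  # alcohol values from the preview rows 0-2
y_pred = [10.0, 12.0, 13.0]  # hypothetical model predictions

mse = mean_squared_error(y_true, y_pred)
print(f"MSE: {mse:.4f}")
```

Because the errors are squared, a few large misses weigh more heavily than many small ones, so a model with a lower squared error tends to avoid big outlying mistakes.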

Model inspection

Here we evaluate the performance of the best model on the train and test sets using feyn.Model.plot.

# Summary plot of the best model
best_model.plot(train, test)
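
Regression models are commonly judged with metrics such as R² and RMSE, compared between train and test data. The sketch below computes both by hand so it is clear what the numbers mean. The function names and the sample values are assumptions for illustration, not Feyn's API.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


y_true = [10.3, 12.3, 12.8, 11.6, 11.2]  # alcohol values from the preview
y_pred = [10.5, 12.0, 12.5, 11.8, 11.0]  # hypothetical predictions

print(f"R2:   {r2_score(y_true, y_pred):.3f}")
print(f"RMSE: {rmse(y_true, y_pred):.3f}")
```

An R² close to 1 on the train set combined with a much lower value on the test set would suggest overfitting, which is the main thing to look for when comparing the two.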
