Poisonous Mushrooms
by: Chris Cave
Feyn version: 2.1+
Last updated: 27/09/2021
Importing the dataset
Here we use the QLattice to predict whether a mushroom is poisonous or edible. You can find this dataset and further descriptions of the features on the UCI Machine Learning Repository.
This example shows how to use the QLattice when most of the features are categorical.
import pandas as pd
import numpy as np
import feyn
from sklearn.model_selection import train_test_split
data = pd.read_csv("../data/mushrooms.csv")
# Replace the categories p and e with 0 and 1 respectively
data["class"] = data['class'].replace({"p": 0, "e": 1})
data.head()
| | class | cap-shape | cap-surface | cap-color | bruises | odor | gill-attachment | gill-spacing | gill-size | gill-color | ... | stalk-surface-below-ring | stalk-color-above-ring | stalk-color-below-ring | veil-type | veil-color | ring-number | ring-type | spore-print-color | population | habitat |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | x | s | n | t | p | f | c | n | k | ... | s | w | w | p | w | o | p | k | s | u |
| 1 | 1 | x | s | y | t | a | f | c | b | k | ... | s | w | w | p | w | o | p | n | n | g |
| 2 | 1 | b | s | w | t | l | f | c | b | n | ... | s | w | w | p | w | o | p | n | n | m |
| 3 | 0 | x | y | w | t | p | f | c | n | n | ... | s | w | w | p | w | o | p | k | s | u |
| 4 | 1 | x | s | g | f | n | f | w | b | k | ... | s | w | w | p | w | o | e | n | a | g |
5 rows × 23 columns
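With this many categorical columns, it can be useful to check how many distinct values each feature takes and how balanced the target is before modelling. The following is a minimal sketch on a made-up three-column frame, not the real file; the names `toy`, `n_categories` and `class_balance` are illustrative only. It uses `map` for the label encoding, which behaves like the `replace` call above when every label appears in the mapping.

```python
import pandas as pd

# Toy stand-in for the mushroom data (hypothetical values, not the real file)
toy = pd.DataFrame({
    "class": ["p", "e", "e", "p"],
    "odor": ["p", "a", "l", "p"],
    "veil-type": ["p", "p", "p", "p"],
})

# Encode the target: poisonous -> 0, edible -> 1
toy["class"] = toy["class"].map({"p": 0, "e": 1})

# Number of distinct categories per feature column
n_categories = toy.drop(columns="class").nunique()

# Share of each class in the target
class_balance = toy["class"].value_counts(normalize=True)

print(n_categories)
print(class_balance)
```

A column with a single distinct value (like `veil-type` in this toy frame) carries no information for the model, so this kind of check can also flag features that are safe to ignore.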
QLattice
Connect to the QLattice. We declare which variables are categorical using the stypes (semantic types) parameter.
random_state = 42
train, test = train_test_split(data, test_size=0.7, random_state = random_state)
stypes = {}
for col in data.columns:
    if data[col].dtype == "object":
        stypes[col] = "c"
ql = feyn.connect_qlattice()
ql.reset(42)
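The stypes loop above can also be written as a dictionary comprehension. The sketch below runs on a small made-up frame (`toy` is illustrative, not the mushroom data). It swaps the `dtype == "object"` check for `is_numeric_dtype`, on the assumption that some pandas versions may store text columns under a dedicated string dtype rather than `object`; treat it as one possible variant, not the tutorial's own method.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Toy frame: a numeric target and two text-valued features (made-up values)
toy = pd.DataFrame({
    "class": [0, 1, 1],
    "odor": ["p", "a", "l"],
    "habitat": ["u", "g", "m"],
})

# Mark every non-numeric column as categorical ("c")
stypes = {col: "c" for col in toy.columns if not is_numeric_dtype(toy[col])}

print(stypes)
```

The numeric target `class` is left out of the dictionary, so only the text-valued features are declared as categorical.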
Use auto_run to obtain fitted models
models = ql.auto_run(train, "class", kind="classification", n_epochs=10, stypes=stypes, max_complexity=5)
best = models[0]
Summary plot to evaluate model
best.plot(train, test)
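Alongside the summary plot, a single accuracy figure on the test set can be a handy sanity check. The sketch below assumes the classifier produces a probability of class 1 (edible) for each row. In Feyn these would presumably come from something like `best.predict(test)`, but that is an assumption here, so the example uses made-up probabilities instead. `accuracy_at_threshold` is a hypothetical helper, not part of Feyn.

```python
def accuracy_at_threshold(y_true, y_prob, threshold=0.5):
    """Fraction of rows where the thresholded probability matches the label."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels and predicted probabilities, for illustration only
y_true = [0, 1, 1, 0, 1]
y_prob = [0.1, 0.9, 0.4, 0.2, 0.8]

acc = accuracy_at_threshold(y_true, y_prob)
print(acc)
```

For a problem like this, where mistaking a poisonous mushroom for an edible one is the costly error, a threshold above 0.5 may be worth considering; the helper takes the threshold as a parameter for that reason.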