ROC curve

by: Chris Cave
(Feyn version 3.0 or newer)

There are many metrics for evaluating a classification model: accuracy, precision, recall, F1 score, and so on. Each of these metrics requires fixing a decision boundary: samples scoring above it are classified as True, and samples below it as False. This boundary is typically called a threshold, and by default it is set at 0.5.

However, there are situations where you want a classifier that does not predict the positive class unless it is very sure. In that case you would raise the threshold from 0.5 to something much higher, say 0.8. The classifier would then only predict a positive when it is nearly certain, but this comes at the cost of missing some positives.
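The effect of raising the threshold can be seen directly with scikit-learn's metrics. This is a minimal sketch on toy probabilities (the labels and scores below are illustrative, not from the breast-cancer model in this tutorial):

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Toy labels and predicted probabilities (illustrative only)
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_prob = np.array([0.2, 0.4, 0.55, 0.6, 0.3, 0.7, 0.85, 0.95])

metrics = {}
for threshold in (0.5, 0.8):
    y_pred = (y_prob >= threshold).astype(int)  # apply the decision boundary
    metrics[threshold] = (precision_score(y_true, y_pred),
                          recall_score(y_true, y_pred))

# Raising the threshold trades recall for precision:
# at 0.5 -> precision 0.60, recall 0.75
# at 0.8 -> precision 1.00, recall 0.50
print(metrics)
```

Here the stricter threshold eliminates the false positives but misses half of the true positives.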

The receiver operating characteristic (ROC) curve captures how the classifier behaves at every threshold. This gives a good indication of where to make the best trade-offs.
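To make the construction concrete: each point on the ROC curve is the (false positive rate, true positive rate) pair obtained at one particular threshold. A sketch using scikit-learn on the same illustrative toy data (not the model's actual predictions):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Toy labels and predicted probabilities (illustrative only)
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_prob = np.array([0.2, 0.4, 0.55, 0.6, 0.3, 0.7, 0.85, 0.95])

# One (FPR, TPR) point per candidate threshold
fpr, tpr, thresholds = roc_curve(y_true, y_prob)

# The area under the curve summarises performance across all thresholds
auc = roc_auc_score(y_true, y_prob)
```

Sweeping the threshold from high to low traces the curve from (0, 0) to (1, 1); a curve hugging the top-left corner means good separation at some threshold.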

import feyn
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split


# Load into a pandas dataframe
breast_cancer = load_breast_cancer(as_frame=True)
data = breast_cancer.frame

# Train/test split
train, test = train_test_split(data, test_size=0.4, stratify=data['target'], random_state=666)

ql = feyn.QLattice()
models = ql.auto_run(
    data=train,
    output_name='target',
    kind='classification'
)

best = models[0]

ROC curve

best.plot_roc_curve(train)

ROC curve with threshold

We can plot a particular threshold on the curve and read off the metrics of the classifier (accuracy, F1 score, precision, and recall) at that threshold.

best.plot_roc_curve(train, threshold=0.5)

Saving the plot

You can save the plot using the filename parameter. The plot is saved in the current working directory unless another path is specified.

best.plot_roc_curve(data=train, filename="feyn-plot")

If no extension is specified, the plot is saved as a png file.

Location in Feyn

This function can also be found in the feyn.plots module.

from feyn.plots import plot_roc_curve

y_true = train['target']
y_pred = best.predict(train)

plot_roc_curve(y_true, y_pred)
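If you only want the summary number rather than the plot, the same kind of y_true/y_pred pair can be passed to scikit-learn's roc_auc_score. The arrays below are illustrative stand-ins for train['target'] and best.predict(train):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Illustrative stand-ins for the true labels and predicted probabilities
y_true = np.array([0, 0, 1, 1, 0, 1])
y_pred = np.array([0.2, 0.3, 0.8, 0.7, 0.45, 0.9])

# Here every positive outranks every negative, so AUC is perfect
auc = roc_auc_score(y_true, y_pred)
print(f"AUC: {auc:.2f}")  # prints "AUC: 1.00"
```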
Copyright © 2023 Abzu.ai
Feyn®, QGraph®, and the QLattice® are registered trademarks of Abzu®