Plot probability scores
by: Chris Cave
(Feyn version 3.0 or newer)
The output of a binary classifier can be interpreted as the predicted probability of a sample belonging to the positive class. It is useful to see the distribution of scores to evaluate the quality of the classifier.
import feyn
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
# Load into a pandas dataframe
breast_cancer = load_breast_cancer(as_frame=True)
data = breast_cancer.frame
# Train/test split
train, test = train_test_split(data, test_size=0.4, stratify=data['target'], random_state=666)
# Instantiate a QLattice
ql = feyn.QLattice()
models = ql.auto_run(train, 'target', 'classification')
best = models[0]
The function plot_probability_scores plots a histogram of the probabilities of the passed dataset.
best.plot_probability_scores(test)
The different colours highlight the true classes of the samples. A good classifier separates the two classes clearly: negative samples receive scores near 0 and positive samples receive scores near 1, with little overlap in between.
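To make the idea of class separation concrete, here is a minimal sketch using only NumPy. It is not Feyn's implementation; the scores and labels are hypothetical values made up for illustration, and the bin count and the overlap measure are assumptions chosen for this example. It bins the scores of each true class on shared bin edges and counts how many samples fall in bins occupied by both classes.

```python
import numpy as np

# Hypothetical labels and predicted probabilities, made up for illustration.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_score = np.array([0.05, 0.10, 0.30, 0.55, 0.45, 0.85, 0.90, 0.95])

# Shared bin edges so the two class histograms are directly comparable.
bins = np.linspace(0.0, 1.0, 6)  # five bins of width 0.2

# One histogram per true class, which is what the colours in the plot distinguish.
neg_counts, _ = np.histogram(y_score[y_true == 0], bins=bins)
pos_counts, _ = np.histogram(y_score[y_true == 1], bins=bins)

# A rough overlap measure (an assumption for this sketch): per bin, the
# smaller of the two class counts, summed over all bins.
overlap = int(np.minimum(neg_counts, pos_counts).sum())

print("negative class counts:", neg_counts.tolist())
print("positive class counts:", pos_counts.tolist())
print("overlapping samples:", overlap)
```

An overlap of zero would mean the two histograms never share a bin, which is the clear separation described above. Larger values indicate score ranges where the classifier cannot tell the classes apart.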
Saving the plot
You can save the plot using the filename parameter. The plot is saved in the current working directory unless another path is specified.
best.plot_probability_scores(data=train, filename="feyn-plot")
If no file extension is specified, the plot is saved as a png file.
Location in Feyn
This function can also be found in the feyn.plots module.
from feyn.plots import plot_probability_scores
# The target column of the breast cancer dataset is 'target'
y_true = train['target']
y_pred = best.predict(train)
plot_probability_scores(y_true, y_pred)