Feyn

Validate data

by: Kevin Broløs
(Feyn version 2.0.7 or newer)

validate_data is a function that helps you catch a few common data errors that can cause unwanted effects in feyn. We advise running it once after loading your data, to check that the data is in good enough condition.

To validate your data properly, you need to specify the kind of problem you intend to solve, the name of the output column, and the stypes that you'll use for sample_models if any of your columns are categorical.
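If many columns are categorical, the stypes dictionary can be built from the column dtypes rather than typed by hand. The sketch below is one way to do that with pandas; the toy DataFrame and its column names are made up for illustration, and it assumes that "c" is the stype marking a categorical input, as described in the Semantic types guide.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical toy data: two numerical columns and two string columns.
data = pd.DataFrame({
    "age": [31, 45, 27],
    "smoker": ["yes", "no", "no"],
    "region": ["north", "south", "north"],
    "charges": [4200.0, 8100.5, 3900.0],
})

# Declare every non-numerical column as categorical.
# Assumption: "c" is the stype for categorical inputs.
stypes = {name: "c" for name in data.columns if not is_numeric_dtype(data[name])}
print(stypes)

# With feyn installed, the dictionary is passed straight through:
# from feyn import validate_data
# validate_data(data=data, kind="regression", output_name="charges", stypes=stypes)
```

Numerical columns that encode categories (for example integer codes) are not picked up by this dtype check and would still need to be added to stypes manually.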

Here's an example that validates:

from feyn.datasets import make_classification
from feyn import validate_data

train, test = make_classification()

validate_data(data=train, kind='classification', output_name='y', stypes={})

Here's an example that doesn't validate, because we're using a continuous numerical output to do a classification:

from feyn.datasets import make_regression
from feyn import validate_data

train, test = make_regression()

try:
    validate_data(data=train, kind='classification', output_name='y', stypes={})
except ValueError as e:
    print(e)

This prints:

y must be an iterable of booleans or 0s and 1s

The examples above validate only the training data, but we recommend validating the full dataset.

validate_data will raise a ValueError in the following cases:

  • If the output column does not consist of only numerical values for a regression case.
  • If the output column does not consist of boolean-like values for a classification case.
  • If any of the columns are object types, but have not been declared as categorical in stypes.
  • If columns contain NaN values, and are not declared as categorical in stypes.
    • Note: categoricals support NaN values by assigning them their own weights, so we allow this. You should still consider whether that's the behaviour you want, and handle it yourself if it isn't.
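Since validate_data stops at the first error it raises, it can be handy to list every offending column up front. The sketch below approximates the four checks above using pandas only. It is not feyn's implementation: find_validation_issues is a hypothetical helper, the dtype tests are a simplification, and it assumes "c" or "cat" marks a categorical stype.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def find_validation_issues(data, kind, output_name, stypes=None):
    """Return a list of problems, loosely mirroring the documented checks."""
    stypes = stypes or {}
    # Assumption: "c" and "cat" are the stypes that declare a categorical column.
    categorical = {name for name, stype in stypes.items() if stype in ("c", "cat")}
    issues = []

    output = data[output_name]
    if kind == "regression" and not is_numeric_dtype(output):
        issues.append(f"{output_name}: output is not numerical")
    if kind == "classification":
        # Boolean-like: a bool column, or only 0s and 1s among non-missing values.
        boolean_like = is_bool_dtype(output) or output.dropna().isin([0, 1]).all()
        if not boolean_like:
            issues.append(f"{output_name}: output is not boolean-like")

    for name in data.columns:
        if name in categorical:
            continue  # categoricals may hold strings and NaN values
        column = data[name]
        if not is_numeric_dtype(column):
            issues.append(f"{name}: non-numerical but not declared categorical")
        elif column.isna().any():
            issues.append(f"{name}: contains NaN values")
    return issues


# Hypothetical toy data with one NaN in a numerical column and one in a string column.
data = pd.DataFrame({
    "age": [31.0, None, 47.0, 52.0],
    "city": ["Oslo", "Lima", None, "Pune"],
    "y": [0, 1, 1, 0],
})

print(find_validation_issues(data, "classification", "y"))

# Declaring city as categorical and dropping rows with a missing age clears the list.
cleaned = data.dropna(subset=["age"])
print(find_validation_issues(cleaned, "classification", "y", stypes={"city": "c"}))
```

Dropping rows is only one way to handle missing numerical values; imputing them is another, and which is appropriate depends on the dataset. Once the list is empty, running validate_data itself remains the authoritative check.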
Copyright © 2023 Abzu.ai
Feyn®, QGraph®, and the QLattice® are registered trademarks of Abzu®