Feyn Documentation

Validate data

by: Kevin Broløs
(Feyn version 2.0.7 or newer)


validate_data is a function that checks your data for a few common errors that can cause unwanted effects in feyn. We advise running it once after loading your data, to confirm that the data is in good enough condition.

To validate your data properly, specify the kind of problem you intend to solve, the name of the output column, and, if any columns are categorical, the stypes that you'll use for sample_models.

Example

from feyn.datasets import make_classification
from feyn import validate_data

train, test = make_classification()

validate_data(data=train, kind='classification', output_name='y', stypes={})

Here's an example that doesn't validate, because we're using a continuous numerical output to do a classification:

from feyn.datasets import make_regression
from feyn import validate_data

train, test = make_regression()

try:
    validate_data(data=train, kind='classification', output_name='y', stypes={})
except ValueError as e:
    print(e)

This prints:

y must be an iterable of booleans or 0s and 1s

The examples above run it on the training data only, but we recommend running it on the full dataset.

validate_data will raise a ValueError in the following cases:

  • If the output column does not consist only of numerical values in a regression case.
  • If the output column does not consist only of boolean-like values (booleans, or 0s and 1s) in a classification case.
  • If any of the columns are object types, but have not been declared as categorical in stypes.
  • If any columns contain NaN values and are not declared as categorical in stypes.
    • Note: categoricals support NaN values by assigning them their own weights, so we allow this. You should still consider whether that's the behaviour you want, and handle it yourself if it isn't.
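To make the four cases above concrete, here is a minimal sketch of equivalent checks in plain Python. It is not feyn's implementation of validate_data. The function name precheck, the dict-of-lists input, treating None as a missing value, and the error messages are all assumptions made for illustration. It also assumes that categorical columns are marked 'c' or 'cat' in stypes.

```python
import math
from numbers import Number

# Assumption: these are the stype values that mark a column as categorical.
CATEGORICAL = ("c", "cat")


def is_missing(value):
    """True for float NaN and for None (treated here as a missing value)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def precheck(columns, kind, output_name, stypes=None):
    """Raise ValueError if `columns` hits one of the four cases listed above.

    `columns` maps column names to lists of values. Hypothetical helper,
    not part of feyn.
    """
    stypes = stypes or {}
    output = columns[output_name]

    if kind == "regression":
        # Case 1: a regression output must be numerical.
        if not all(isinstance(v, Number) for v in output):
            raise ValueError(f"{output_name} must contain only numerical values")
    elif kind == "classification":
        # Case 2: a classification output must be boolean-like.
        # True == 1 and False == 0, so booleans pass this test too.
        if not all(v in (0, 1) for v in output):
            raise ValueError(f"{output_name} must contain only booleans or 0s and 1s")
    else:
        raise ValueError(f"unknown kind: {kind!r}")

    for name, values in columns.items():
        if stypes.get(name) in CATEGORICAL:
            continue  # categoricals may hold strings and missing values
        # Case 3: non-numeric (object-like) values need a categorical stype.
        if any(not isinstance(v, Number) and v is not None for v in values):
            raise ValueError(f"{name} has non-numeric values but is not categorical in stypes")
        # Case 4: missing values are only allowed in categorical columns.
        if any(is_missing(v) for v in values):
            raise ValueError(f"{name} has NaN values but is not categorical in stypes")


data = {
    "age": [31.0, 45.0, 27.0],
    "colour": ["red", None, "blue"],
    "y": [0, 1, 1],
}

# Passes: the string column with a missing value is declared categorical.
precheck(data, kind="classification", output_name="y", stypes={"colour": "c"})

# Fails: the same column is not declared in stypes.
try:
    precheck(data, kind="classification", output_name="y")
except ValueError as e:
    print(e)
```

With a pandas DataFrame, the same sketch could be fed df.to_dict('list'). For real use, call validate_data itself, since its exact rules may differ from this approximation.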

Copyright © 2024 Abzu.ai - Feyn license: CC BY-NC-ND 4.0
Feyn®, QGraph®, and the QLattice® are registered trademarks of Abzu®