# Semantic types

by: Kevin Broløs & Chris Cave

(Feyn version 2.0 or newer)

There are three types of input data that `Model`s can interpret:

- numerical, which includes:
  - floating point numbers
  - integers
- categorical, which includes:
  - strings
- boolean, represented by either:
  - a discrete number, `0` or `1`
  - `True` or `False`

The `Model` handles transformation of inputs, and it uses the `stype` declarations to decide how. **Numerical** values learn a linear rescaling, **categorical** values get assigned individual weights and **boolean** inputs can only be followed by specific functions.
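
To make the difference between the first two transformations concrete, here is a minimal conceptual sketch. It is not Feyn's internal code: the function names, parameter names and parameter values are all hypothetical, chosen only to show what "a linear rescaling" and "individual weights" mean.

```python
# Conceptual sketch only: these are NOT Feyn functions.
# All names and parameter values below are hypothetical.

def numerical_input(x, scale, bias):
    """A numerical input learns a linear rescaling of the raw value."""
    return scale * x + bias


def categorical_input(category, weights):
    """A categorical input learns one weight per category."""
    return weights[category]


# Hypothetical "learned" parameters
fruit_weights = {'apple': 0.8, 'pear': -0.3, 'banana': 0.1, 'orange': 0.5}

print(numerical_input(2.0, scale=0.5, bias=-1.0))  # 0.0
print(categorical_input('pear', fruit_weights))    # -0.3
```

In both cases the raw value enters the model through a single input, and the learned parameters do the work that manual preprocessing would otherwise do.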

The **categorical** `stype` helps maintain a simple and interpretable model by avoiding dimensional expansion. The **boolean** `stype` helps the `Model` converge faster by ignoring functions that don't provide meaningful transformations. You can also declare boolean variables as either **numerical** or **categorical** if you want to circumvent this behaviour.
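
As an illustration of the dimensional expansion that the **categorical** `stype` avoids, the sketch below one-hot encodes a small column by hand with NumPy. It only shows the shape of the problem and does not use Feyn; the variable names are illustrative.

```python
import numpy as np

fruits = np.array(['apple', 'pear', 'banana', 'orange'])

# One-hot encoding creates one column per distinct category.
categories = np.unique(fruits)
one_hot = (fruits[:, None] == categories[None, :]).astype(int)

print(fruits.shape)   # (4,)   a single input when declared as categorical
print(one_hot.shape)  # (4, 4) four inputs after one-hot encoding
```

With many distinct categories the one-hot version grows by one input per category, while the categorical input stays a single, named feature in the model.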

Because the `Model` handles these transformations itself, the preprocessing that would otherwise fall on the data scientist becomes simpler: you should **not** standardise inputs, **nor** should you one-hot encode categoricals.

Instead, you assign the relevant `stypes` as shown in the example below.

```
import feyn
import numpy as np

# A small toy data set with one input of each semantic type
data = {
    'numerical_input': np.random.rand(4),
    'categorical_input': np.array(['apple', 'pear', 'banana', 'orange']),
    'boolean_input': np.array([True, False, False, True]),
    'output_name': np.random.rand(4)
}

# Map each input name to its stype:
# 'f' for numerical, 'c' for categorical, 'b' for boolean
stypes = {
    'numerical_input': 'f',
    'categorical_input': 'c',
    'boolean_input': 'b'
}

ql = feyn.connect_qlattice()

# Pass the stypes along when training
models = ql.auto_run(
    data=data,
    output_name='output_name',
    stypes=stypes,
    n_epochs=1
)
```

If no `stype` is provided for an input, it is assumed to be **numerical**.
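
Since unlisted inputs default to numerical, only the categorical and boolean inputs strictly need an entry. The sketch below builds such a dictionary from the NumPy dtypes of a data dictionary like the one above. `infer_stypes` is a hypothetical helper, not part of Feyn. It also assumes that object-typed arrays hold strings, so check the result before using it.

```python
import numpy as np


def infer_stypes(data, output_name):
    """Hypothetical helper: guess stypes from NumPy dtypes.

    Only categorical ('c') and boolean ('b') inputs are listed;
    everything else is left out and therefore treated as numerical.
    """
    stypes = {}
    for name, values in data.items():
        if name == output_name:
            continue  # the example above declares stypes for inputs only
        kind = np.asarray(values).dtype.kind
        if kind == 'b':
            stypes[name] = 'b'
        elif kind in ('U', 'S', 'O'):
            # Assumption: string-like or object arrays are categorical
            stypes[name] = 'c'
    return stypes


data = {
    'numerical_input': np.random.rand(4),
    'categorical_input': np.array(['apple', 'pear', 'banana', 'orange']),
    'boolean_input': np.array([True, False, False, True]),
    'output_name': np.random.rand(4)
}

stypes = infer_stypes(data, output_name='output_name')
print(stypes)  # {'categorical_input': 'c', 'boolean_input': 'b'}
```

Note that a boolean column stored as the integers `0` and `1` has an integer dtype, so this helper would leave it as numerical. Declare it as `'b'` explicitly if you want the boolean behaviour.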