# feyn.__future__.contrib.stats

*function* graph_f_score

```
def graph_f_score(
    graph,
    data
)
```

```
Computes the F-statistic associated with a feyn graph under the null hypothesis.
The null hypothesis is that every weight on each feature and category is equal to zero.
If the hypothesis is true, the F score follows F(q, n - p),
the Fisher distribution with q and n - p degrees of freedom, where:
* q is the number of weights assumed to be zero
* n is the number of samples in data
* p is the number of parameters in the graph.
The F score is calculated as:
    num = (sum((data[target].mean - data[target])**2) - graph.mse(data) * n) * (n - p)
    denom = graph.mse(data) * n * q
    F = num / denom
Arguments:
    graph {feyn.Graph} -- Graph to test the null hypothesis for.
    data {dict of numpy arrays or pandas DataFrame} -- Data to test the significance of the graph on.
Returns:
    tuple -- The F score of the hypothesis and the p-value.
```
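As a rough sketch of the formula above (not part of feyn: the `f_score` helper and the least-squares toy model below are hypothetical stand-ins for a graph), the statistic and its p-value can be computed with numpy and scipy:

```python
import numpy as np
from scipy import stats

def f_score(y_true, y_pred, n_params, n_zero_weights):
    """F statistic for the null hypothesis that n_zero_weights weights are zero.

    Mirrors the documented formula; graph.mse(data) * n is the residual
    sum of squares, so the residuals are used directly here.
    """
    n = len(y_true)
    sse = np.sum((y_true - y_pred) ** 2)          # residual sum of squares, graph.mse(data) * n
    sst = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    q, p = n_zero_weights, n_params
    f = (sst - sse) * (n - p) / (sse * q)
    p_value = stats.f.sf(f, q, n - p)             # survival function of F(q, n - p)
    return f, p_value

# Toy regression standing in for a graph: y depends linearly on x.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(scale=0.1, size=100)
slope, intercept = np.polyfit(x, y, 1)
f, p = f_score(y, slope * x + intercept, n_params=2, n_zero_weights=1)
```

With a strong linear signal, the F score is large and the p-value is effectively zero, so the null hypothesis is rejected.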

*function* graph_g_score

```
def graph_g_score(
    graph,
    data
)
```

```
Computes the G-statistic associated with a feyn graph under the null hypothesis.
The null hypothesis is that every weight on each feature and category is equal to zero.
If the hypothesis is true, the G score follows chi2(q),
the chi-squared distribution with q degrees of freedom, where:
* q is the number of weights assumed to be zero
The G-statistic is calculated as:
    G = 2 * (graph_log_likelihood(graph, data) - log-likelihood of the constant model)
where
    log-likelihood of the constant model = #neg_class * np.log(#neg_class) + #pos_class * np.log(#pos_class) - #samples * np.log(#samples)
Arguments:
    graph {feyn.Graph} -- Graph to test the null hypothesis for.
    data {dict of numpy arrays or pandas DataFrame} -- Data to test the significance of the graph on.
Returns:
    tuple -- The G score of the hypothesis and the p-value.
```
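A minimal sketch of the likelihood-ratio computation above, assuming a binary classifier that outputs probabilities (the `g_score` helper is hypothetical, not part of feyn):

```python
import numpy as np
from scipy import stats

def g_score(y_true, p_pred, n_zero_weights):
    """G statistic (likelihood-ratio test) for a binary classifier.

    G = 2 * (model log-likelihood - constant-model log-likelihood), where
    the constant model always predicts the positive-class frequency.
    """
    n = len(y_true)
    n_pos = int(y_true.sum())
    n_neg = n - n_pos
    # Bernoulli log-likelihood of the model's predicted probabilities
    ll_model = np.sum(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))
    # Log-likelihood of the constant model, as in the docstring formula
    ll_const = n_neg * np.log(n_neg) + n_pos * np.log(n_pos) - n * np.log(n)
    g = 2.0 * (ll_model - ll_const)
    p_value = stats.chi2.sf(g, df=n_zero_weights)  # survival function of chi2(q)
    return g, p_value

# Toy data: a confident, well-calibrated classifier on a balanced target.
y = np.array([0] * 50 + [1] * 50)
probs = np.where(y == 1, 0.9, 0.1)
g, p = g_score(y, probs, n_zero_weights=2)
```

A model that clearly beats the constant baseline yields a large G and a near-zero p-value.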

*function* graph_log_likelihood

```
def graph_log_likelihood(
    graph,
    data
)
```

```
Computes the log-likelihood of the graph evaluated on the data set.
Arguments:
    graph {feyn.Graph} -- Graph to evaluate the log-likelihood of.
    data {dict of numpy arrays or pandas DataFrame} -- Data to evaluate the log-likelihood on.
Returns:
    scalar -- The log-likelihood of the graph on the data set.
```
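For a binary classifier, this quantity is presumably the Bernoulli log-likelihood of the predicted probabilities. A sketch under that assumption (the `log_likelihood` helper is illustrative, not feyn's implementation):

```python
import numpy as np

def log_likelihood(y_true, p_pred, eps=1e-12):
    """Bernoulli log-likelihood of predicted probabilities on binary targets."""
    p = np.clip(p_pred, eps, 1 - eps)  # guard against log(0)
    return float(np.sum(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

ll = log_likelihood(np.array([1, 0, 1]), np.array([0.9, 0.1, 0.8]))
```

The result is always non-positive; a perfect prediction approaches zero from below.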

*function* plot_graph_p_value

```
def plot_graph_p_value(
    graph,
    data,
    title='Significance of graph',
    ax=None
)
```

```
Plots the probability density function under the null hypothesis.
The null hypothesis is that every weight on each feature and category is equal to zero.
If the graph is a regression, this plots the Fisher distribution:
under the null hypothesis the F score approximately follows F(q, n - p),
with q and n - p degrees of freedom, where:
* q is the number of weights assumed to be zero
* n is the number of samples in data
* p is the number of parameters in the graph.
If the graph is a classification, this plots the chi-squared distribution:
under the null hypothesis the G score follows chi2(q),
with q degrees of freedom, where:
* q is the number of weights assumed to be zero
This also plots vertical lines intercepting the x-axis at the F or G scores under each hypothesis.
Arguments:
    graph {feyn.Graph} -- Graph to calculate p-values for under the null hypothesis.
    data {dict of numpy arrays or pandas DataFrame} -- Data to test the significance of the graph on.
Keyword Arguments:
    title {str} -- Title of the axes. (default: {'Significance of graph'})
    ax {matplotlib.Axes} -- Axes to plot on. (default: {None})
Returns:
    matplotlib.Axes -- Plot of the distribution under the null hypothesis.
```
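A sketch of what this plot looks like for the classification case, assuming an observed G score and its degrees of freedom are already known (the `plot_null_pdf` helper is hypothetical, not feyn's implementation):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

def plot_null_pdf(score, df, title="Significance of graph", ax=None):
    """Plot the chi2(df) density under the null hypothesis and mark the
    observed G score with a vertical line (classification case)."""
    if ax is None:
        _, ax = plt.subplots()
    # Cover the central 99.8% of the null distribution
    x = np.linspace(stats.chi2.ppf(0.001, df), stats.chi2.ppf(0.999, df), 200)
    ax.plot(x, stats.chi2.pdf(x, df), label=f"chi2({df}) pdf")
    ax.axvline(score, color="r", label=f"G = {score:.2f}")
    ax.set_title(title)
    ax.legend()
    return ax

ax = plot_null_pdf(score=7.5, df=3)
```

The regression case would substitute `stats.f` with degrees of freedom q and n - p for `stats.chi2`.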