Feyn Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[3.4.0] - 2024-06-04
Added
- auto_run and sample_models now automatically determine the kind of model to train based on the stypes if kind=auto. If the output does not have an stype, they fall back to the previous default behaviour of kind=regression.
- Dataframes where the output column has the dtype bool are now supported without first converting the type to int or float.
- Added function flip_cmap to Theme, which reverses a specified colormap belonging to a Theme.
- Added shorthand function flip_diverging_cmap to Theme, which reverses the feyn-diverging cmap used by default in plot_response_2d and plot_probability_scores.
- Added new colormap - feyn-signal - which is used in the signal plots and other 0-centered renderings that need a colorbar. It defaults to the same gradient as feyn-diverging but is not affected by the flip function above.
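As a rough illustration of what reversing a colormap means, the sketch below flips a list of colour stops. The helper name flip_stops and the hex values are assumptions made for this example; they are not part of the Theme API.

```python
def flip_stops(stops):
    """Return the colour stops in reverse order, leaving the input untouched."""
    return list(reversed(stops))


# Hypothetical three-stop diverging gradient (low, mid, high).
diverging = ["#0000ff", "#ffffff", "#ff0000"]
flipped = flip_stops(diverging)
```

After flipping, the colour that used to mark low values marks high values and vice versa, while the midpoint stays where it was.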
Changed
- The limits for when infer_stypes considers a numerical series likely to be ordinal or nominal have been tightened to make it more conservative in its attribution.
- Numerical binary input columns are now auto-assigned the stype numerical rather than categorical when using infer_stypes.
- Model graph representations now read linear or logistic on the output node instead of out.
- Fixed the minimum dependency on matplotlib to 3.7.x.
- Theme is now accessible from the top-level package.
- plot_probability_scores now uses the feyn-diverging cmap boundary colors to plot the negative and positive classes instead of the theme's color cycler.
- Increased the size of the markers in plot_response_2d, and styled them to be easier to distinguish from the background.
- plot_probability_scores now uses the theme-specific edge color for higher contrast for the bars.
Fixed
- Fixed a bug where removing variables from a dataframe could result in an error during fitting if the QLattice instance had already been used to fit on the full data.
- Fixed a bug where the axis labels for the 2D response plot would not always be vertical when they needed to be.
- expand_auto_run now uses the same random seed as provided to the QLattice, if any.
- plot_response_1d is now able to compute for models with boolean input features without requiring the user to cast to int on their own.
- Theme.cycler(idx) now correctly cycles if you overflow the length of the cycler.
- Reference models now contain a kind attribute, allowing their use with plots that have kind restrictions again.
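The cycler fix above describes wrap-around indexing. A minimal sketch of that idea, assuming a plain list of colours and a hypothetical helper pick_color (neither is taken from the library):

```python
def pick_color(colors, idx):
    """Return the colour at position idx, wrapping around past the end."""
    if not colors:
        raise ValueError("the colour cycle is empty")
    return colors[idx % len(colors)]


palette = ["#111111", "#222222", "#333333"]  # hypothetical cycle
wrapped = pick_color(palette, 4)  # 4 % 3 == 1, so the second colour
```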
[3.3.1] - 2024-04-24
Fixed
- Fixed an issue causing plot_model_response_auto to fail when fixing categorical inputs.
[3.3.0] - 2024-04-22
Added
- Added a new function - plot_model_response_auto. This plot inspects the model and automatically determines which inputs to display in the plot and which to fix. It also chooses the appropriate response plot to display.
  - This function can also be accessed directly on the model instance as plot_response_auto.
- Added support for the upcoming release of numpy 2.0.0 based on the API of release candidate 1. See the numpy release notes for more about the coming changes.
Fixed
- Fixed a typo in the ax label for model.plot_probabilities.
- Fixed an issue where using the int16 dtype would result in a ValueError: data contains nan or infinite values. The int16 dtype is now properly accepted. Future errors for unsupported types will now reflect that a type in the data is unsupported.
[3.2.0] - 2024-02-26
Added
- auto_run will now try to automatically infer the stypes based on the dataset if no stypes are provided as argument. It will additionally produce warnings for some data types and skip training on redundant or unsupported data types like ID columns and dates.
- feyn.tools.infer_stypes as used by auto_run is accessible to use on its own.
- stype: skip has been added as an internal meta type to indicate columns to skip during sampling. This behaviour is internal and is subject to change, and we recommend removing the columns from the dataframe instead of relying on this.
- plot_probability_scores now supports adding custom legend labels and legend positioning. The legend can also be turned off by specifying legend_loc=None. It also now has a title.
- plot_segmented_loss now supports adding custom legend labels and legend positioning. The legend can also be turned off by specifying legend_loc=None. It also now has a title.
- plot_model_response_2d now has a legend that explains the scatter and displays the fixed values. It also now supports specifying the cmap to use.
Changed
- When using plots and metrics that only work for classifiers on the model instance, they will now return a TypeError if the model is not a classifier. Functions changed are model.plot_roc_curve, model.plot_pr_curve, model.plot_probability_scores and model.plot_confusion_matrix, along with metrics model.roc_auc_score, model.accuracy_score and model.accuracy_threshold.
Fixed
- Fixed an issue with some input types not working in the experimental interactive response_1d plot.
- plot_probability_scores now uses the proper theme colors for negative and positive classes.
- Fixed a bug when using non-protected functions where the exponential function would result in an error.
[3.1.0] - 2024-01-22
Added
- feyn.tools.split now supports stratification on one or more columns using the stratify parameter. It will additionally raise an exception if the chosen splits are not supported and produce warnings, especially on small datasets, if the ratios deviate significantly from the expectations.
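To make the idea of stratification concrete, here is a small standard-library sketch that splits rows while keeping each stratum's share roughly equal in both parts. It is an illustration under stated assumptions, not the implementation behind feyn.tools.split: the function stratified_split, its parameters and the toy rows are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, key, ratio=0.8, seed=0):
    """Split rows into two lists, keeping the share of each stratum similar."""
    rng = random.Random(seed)

    # Group the rows by the value of the stratification column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    first, second = [], []
    for members in groups.values():
        rng.shuffle(members)
        cut = round(len(members) * ratio)
        first.extend(members[:cut])
        second.extend(members[cut:])
    return first, second


# Toy data: 20 rows with two equally sized classes.
rows = [{"label": i % 2, "x": i} for i in range(20)]
train, test = stratified_split(rows, key="label", ratio=0.8)
```

With very small strata the rounding step can push the realised ratio away from the requested one, which is the kind of deviation the warnings mentioned above are about.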
Changed
- Deprecated function connect_qlattice is no longer included in * imports. It will be removed in a future major release.
- Methods that plot in SVG, such as model.savefig, plot_signal and plot_flow now all consistently return an SVG wrapper that can be saved to file and is displayable in Jupyter environments.
- Methods that would call print or warnings.warn now use appropriate python logging instead. It will honor existing logging configurations. However, if the logging framework is not configured prior to importing feyn (like in Jupyter notebook use cases), it will register a handler to stdout and default to log messages at the INFO level.
- Various functions that had support for dicts of numpy arrays rather than pandas DataFrames now produce a deprecation warning if you use them like that:
  - feyn.tools.split
  - feyn.metrics.get_pearson_correlation
  - feyn.metrics.get_spearmans_correlations
  - feyn.metrics.get_mutual_information
  - feyn.metrics.segmented_loss
- More informative error messages are added to some functions if there's a mismatch between the provided DataFrame and model:
  - feyn.plots.plot_model_summary
  - feyn.plots.plot_model_signal
  - feyn.plots.plot_segmented_loss
  - feyn.plots.plot_model_response_2d
  - feyn.plots.plot_activation_flow
  - feyn.metrics.get_pearson_correlation
  - feyn.metrics.get_spearmans_correlations
  - feyn.metrics.get_mutual_information
  - feyn.metrics.segmented_loss
  - feyn.metrics.get_summary_information
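Since existing logging configurations are honored, one way to control the output is to configure the standard logging module before the import. The sketch below uses only the standard library. The logger name "feyn" is an assumption (libraries commonly log under their package name) and is not confirmed by this changelog.

```python
import logging
import sys

# Configure logging before the library import so the existing setup is honoured.
logging.basicConfig(
    stream=sys.stderr,
    level=logging.WARNING,
    format="%(levelname)s %(name)s: %(message)s",
)

# Assumption: the library logs under a logger named "feyn".
feyn_logger = logging.getLogger("feyn")
feyn_logger.setLevel(logging.ERROR)

# import feyn  # would come after the configuration above
```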
Fixed
- Methods that plot in SVG and HTML, such as model.plot, savefig, plot_signal and plot_flow now all have consistent save behaviour.
- Various internal modules and functions now have proper internal names for a more consistent public API.
- Various public methods that were missing docstrings now have docstrings.
Removed
- We've removed typing_extensions as a dependency, as it is no longer needed.
[3.0.6] - 2023-11-27
Added
- Add support for Python 3.12
Fixed
- Support for macOS should now be more consistent across python versions, especially regarding M1 builds for macOS 11 and above.
- Extended the support for fitting on DataFrames with ExtensionArray and StringArray types. Note, that some extension types may incur an extra performance cost at this time if they're not numpy-based (like PyArrow backed arrays).
- The tick markers for plot_response_2d are now more well-behaved for very small values
[3.0.5] - 2023-01-03
Fixed
- Fixed weird error from having an input name that contains a colon, e.g. weird:name. The user is now directed to rename their input columns.
- Fixed slow computation for mutual information on wide datasets.
- Fixed inability to fit on DataFrames with underlying StringArray containers.
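For the rename that the error message above asks for, a mapping from offending names to safe ones is often enough. A minimal sketch, assuming a hypothetical helper safe_names and an underscore as the replacement character:

```python
def safe_names(names, replacement="_"):
    """Map each name containing a colon to a colon-free alternative."""
    return {name: name.replace(":", replacement) for name in names if ":" in name}


mapping = safe_names(["weird:name", "age"])
# With pandas, the mapping could be applied as df.rename(columns=mapping).
```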
Changed
- Add support for Python 3.11
[3.0.4] - 2022-09-27
Fixed
- Fixed matplotlib dependency bug introduced with latest release. Now Feyn requires matplotlib >= 3.6.0.
- Fixed loading old serialized models. Introduced migration logic which notifies the user to update their models before the next release.
[3.0.3] - 2022-09-22
Fixed
- Fixed bug with using boolean variables as input features.
Changed
- Dropped support for Python 3.7.
[3.0.2] - 2022-06-28
Fixed
- When sample_weights are used to fit models then feyn.Model.loss_value computes the weighted mean loss. That means that weighted mean loss is used again for sorting models when the sample_weights argument is passed to feyn.fit_models() or feyn.QLattice.auto_run().
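A weighted mean loss can be sketched as the weight-scaled sum of per-sample losses divided by the total weight. This is the textbook definition. Whether the library normalises in exactly this way is an assumption, and the function name is hypothetical.

```python
def weighted_mean_loss(losses, sample_weights):
    """Weighted mean of per-sample losses: sum(w * l) / sum(w)."""
    if len(losses) != len(sample_weights):
        raise ValueError("losses and sample_weights must have the same length")
    total_weight = sum(sample_weights)
    if total_weight == 0:
        raise ValueError("sample_weights must not sum to zero")
    weighted_sum = sum(l * w for l, w in zip(losses, sample_weights))
    return weighted_sum / total_weight


# The sample with loss 3.0 counts three times as much as the one with loss 1.0.
value = weighted_mean_loss([1.0, 3.0], [1.0, 3.0])
```

Here the unweighted mean would be 2.0, while the weighted mean is 2.5, so weighting can change how models are ranked.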
[3.0.1] - 2022-04-27
Changed
- Use of feyn is now fully local, using an improved and lightweight version of the QLattice bundled within feyn.
- You should now use feyn.QLattice() to get a QLattice. Notice that the previous functions feyn.connect_qlattice() and ql.reset() are now deprecated. They have been updated to match the new workflow to ensure previous code still works as intended, but will be removed in future releases of feyn.
Fixed
- plot_response_1d on a single-feature model no longer requires the by argument.
- plot_activation_flow now warns the user when more than one sample is given.
- Updated outdated LICENSE file
[2.1.5] - 2022-03-25
Changed
- sklearn and sympy are now dependencies of feyn.
Fixed
- plot_roc_curve now correctly computes for large datasets.
- Minor performance improvements to the training.
[2.1.4] - 2022-01-13
Fixed
- Release windows versions missing in release 2.1.3.
[2.1.3] - 2022-01-11
Changed
- Build for python 3.10 for all major platforms.
- Drop support for python 3.6.
Fixed
- Fixed an issue with np.float128 type not supported on windows machines. Defaults to numpy.longdouble.
[2.1.2] - 2021-12-10
Added
- feyn.reference models now perform automatic categorical preprocessing. This is done using the optional stypes parameter in the constructor.
Changed
- feyn.plots.plot_model_response_2d now has a fixed scale for classification to make it easier to read and compare.
Fixed
- Fixed an issue where feyn.tools.estimate_priors would incorrectly rank the priors in some instances.
- The type checker now allows all numerical numpy dtypes in place of "float" and all integer types for "int", so you no longer have to upcast your dataframes.
- Input name truncation in the model plots no longer truncates if the result would be a same-length or longer string.
[2.1.1] - 2021-10-27
Added
- Sympify:
  - feyn.tools.sympify() (also model.sympify()) now has parameter symbolic_cat that allows you to expand categories to their linear components to make it easier to port or evaluate a sympy expression.
  - feyn.tools.get_sympy_substitutions for easily converting a model and a sample to a substitution dictionary for use with <sympy_expr>.evalf(subs=...) with the new category changes.
- Exposed convenience functions used by auto_run to make it more composable by the user:
  - feyn.tools.infer_available_threads tries to guess how many threads you can use. Returns maximum - 1.
  - feyn.tools.get_progress_label gives a label for use with model.show that displays epoch count and an estimated time to finish (if elapsed seconds are given).
- Experimental feature for IPython environments only: ql.expand_auto_run is a function that takes all the same parameters as auto_run and creates a runnable code cell with the same code that makes it easier for you to fold out an auto_run loop to its primitive components. Note that this contains all of the error handling, general checking and safety measures of auto_run that you might not need, but could serve as a starting point.
- Added model.plot_pr_curve() to plot precision-recall curves.
- Added feyn.tools.estimate_priors function for computing prior probabilities of inputs based on mutual information.
- Added ql.update_priors function primitive for updating the QLattice with prior probabilities of inputs.
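The "maximum - 1" rule mentioned for infer_available_threads can be sketched with the standard library as follows. The helper guess_threads is hypothetical, and the floor of one thread is an assumption added so the sketch stays usable on single-core machines.

```python
import os


def guess_threads():
    """Leave one core free, but never go below a single thread."""
    available = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, available - 1)


threads = guess_threads()
```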
Changed
- feyn.get_diverse_models is now more likely to return up to n models, and has improved runtime.
- feyn.tools.get_model_parameters returns a slightly different structure for categories now. The category is the index, and the weight columns are now named as {input_name}\[_ix\]}. This makes it easier to relate to the specific places in the model, for example for use in substituting values in the sympy expression.
- model.inputs now returns a unique list of input names.
- The target parameter of models from feyn.reference has been renamed output_name.
Fixed
- Sympify model bugs:
  - Numerical input names now work correctly again.
  - Categorical input names are now identifiable if multiple inputs exist in a model with the same name.
  - Input names with sympy reserved characters should be better supported now (the characters get replaced).
- You can now install optional dependencies using pip install feyn[extras]
- Fix display of inputs in models that would sometimes result in hard to copy input names.
- Models with categorical registers should now be reproducible with seed and same version of feyn/qlattice.
[2.1.0] - 2021-09-27
Added
- Add feyn.get_diverse_models. User facing function to get diverse models given their lineage in the QLattice.
- Add feyn.tools.get_model_parameters (also model.get_parameters). User can extract the parameters of a certain feature in the model as a pandas DataFrame.
- Add option to save various plots:
  - plot_model_summary (also model.plot) plot as an html file.
  - plots in the majority of plotting functions.
- Add option to change the figsize of plots in the majority of plotting functions.
- Add more useful display of inputs in model when they share a common beginning.
Changed
- New QLattice algorithm that vastly improves performance and has better traversal of the search space:
  - QLattice.update now expects you to give all the models you have - sorted by your metric of choice - and it will figure out how best to update on its own.
  - New format for saving and loading models. Breaks backwards compatibility with previously saved models.
  - feyn.prune_models no longer supports dropout and decay parameters.
- plot_model_summary (also model.plot):
  - Is intended to plot the most useful metrics and report for your final model, and has been changed to reflect this.
  - Now displays a table of the inputs used in the model.
- plot_model_signal (also model.plot_signal) no longer plots the summary metrics, and the arguments have changed to reflect this.
- plot_partial_2d now has a deprecation warning, use plot_model_response_2d (model.plot_response_2d) instead.
- auto_run:
  - Now sorts best models based on the Bayesian information criterion (BIC) by default rather than loss.
  - No longer returns multiple models with identical mathematical expressions. This means that any number between one and ten models will be returned.
  - Now estimates a time to completion.
  - Starting models are now copied before being added to the fitting loop, meaning that your original models are left unchanged.
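To show why sorting by BIC rather than loss can change the ranking, the sketch below uses the textbook least-squares form of the criterion, n * ln(MSE) + k * ln(n), which holds up to an additive constant under Gaussian errors. The exact formula used by auto_run is not stated in this changelog, so the function and the numbers are illustrative assumptions only.

```python
import math


def bic_gaussian(n_samples, n_params, mse):
    """Textbook BIC for a least-squares fit: n * ln(MSE) + k * ln(n)."""
    return n_samples * math.log(mse) + n_params * math.log(n_samples)


# Hypothetical models: the larger one has a lower loss but more parameters.
simple = bic_gaussian(n_samples=100, n_params=3, mse=0.50)
larger = bic_gaussian(n_samples=100, n_params=9, mse=0.45)
```

Lower BIC is better, so in this toy case the simpler model ranks first even though its loss is higher, because the parameter penalty outweighs the small gain in fit.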
Removed
- Remove feyn.best_diverse_models. Replaced by feyn.get_diverse_models which has similar functionality.
- Remove semantic type "bool" which previously only used functions add, multiply, gaussian2. We now recommend using the numerical semantic type instead.
- Remove deprecated plot_partial. This functionality is covered by plot_model_response_1d.
Fixed
- auto_run has been reprimanded and will now properly honor your selected amount of threads.
- The query language can now be used in all cases - there is no longer a limitation to which queries will be feasible.
[2.0.7] - 2021-08-20
Added
- More informative error messages and type checks for all top-level functions, such as:
  - Primitives: ql.auto_run, ql.sample_models, feyn.sample_models, feyn.fit_models, feyn.prune_models, feyn.best_diverse_models, ql.update
  - Plotting: feyn.plots.plot_partial2d, feyn.plots.plot_roc_curve, feyn.plots.plot_probability_scores, feyn.plots.plot_activation_flow, feyn.plots.plot_model_summary, feyn.plots.plot_model_response_1d, feyn.plots.plot_residuals, feyn.plots.plot_regression
- Model.predict can now accept pd.Series.
- New function added called Model.plot_signal. This displays signal flow through model. This is the previous behaviour of Model.plot.
- New functions make_regression and make_classification available in the feyn.datasets module. These are wrappers of the sklearn functions in sklearn.datasets.
- Feyn now has a native package for Mac with Apple silicon (M1 chip) for Python 3.8 and 3.9
Changed
- plot_probability_scores parameter h_kwargs has been replaced with **kwargs. Now you can pass histogram kwargs as keyword arguments and not a dictionary.
- Model.plot now displays performance figures underneath the model signal. For regressors these are: plot_regression, plot_residuals. For classifiers these are: plot_roc_curve, plot_confusion_matrix.
Removed
- ql.snapshots have been removed and the QLattice no longer has backup and restore functionality.
[2.0.4] - 2021-07-07
Changed
- plot_partial is now called plot_response_1d (feyn.plots.plot_model_response_1d)
  - The named argument fixed is now called input_constraints
  - plot_partial can still be called but now raises a FutureWarning about being deprecated.
[2.0.1] - 2021-06-29
Changed
- model.plot (feyn.plots.plot_model_summary) has been updated:
  - The named argument test is now called compare_data
  - Now supports labels param for custom labels for the summary metrics
  - Now supports a list of compare data, if you want to compare multiple things
  - Any additional metric added will be a more condensed column to make it easier to compare with the primary metrics
Fixed
- Model graphs being cut off by jupyter's viewport in some instances should now be fixed by automatic rescaling to fit in view.
[2.0.0] - 2021-06-18
Changed
- Graph is now called Model
- The QGraph no longer exists. Instead you now work on lists of Models.
- No longer directly instantiate a QLattice. Instead you call feyn.connect_qlattice().
- The common feyn workflow is now contained in one function called auto_run which lives on a connected QLattice.
- In addition to the above automatic workflow, we now have a more expressive set of functions to replace the old:
  - A list of Models can now be sampled from a connected QLattice using sample_models. This replaces part of the behaviour of the previous get_regressor and get_classifier on the QGraph.
  - You now fit a list of Models using feyn.fit_models. This takes a list of Models as an input and returns a list of fitted Models. This replaces part of the previous fit step on the QGraph.
  - You now prune the worst Models from a list using prune_models. This replaces part of the previous fit step on the QGraph. Default behaviour removes duplicate Models, has a decay function on Models and dropout.
  - You get the best diverse Models using feyn.best_diverse_models. This replaces QGraph.best.
  - Now that you operate on a list of Models, you have the freedom of using the native python filter function.
    - You still have a list of useful feyn.filter functions, but they now apply to the native python filter. These functions are listed here; refer to the documentation for more details:
      - Complexity
      - ContainsInputs
      - ExcludeFunctions
      - ContainsFunctions
  - You can display a Model as a graph using the show_model function.
    - show_model detects whether you are in a Jupyter environment and decides what to display.
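The pattern of passing a filter object to the built-in filter can be sketched with stand-in classes. ToyModel and MaxEdges below are hypothetical and exist only for this example. They are not the library's Model or its filter classes, whose constructors are not described in this changelog.

```python
from dataclasses import dataclass


@dataclass
class ToyModel:
    """Stand-in for a model, carrying only what the predicate needs."""
    name: str
    edge_count: int


class MaxEdges:
    """Callable predicate, usable with the built-in filter()."""

    def __init__(self, limit):
        self.limit = limit

    def __call__(self, model):
        return model.edge_count <= self.limit


models = [ToyModel("a", 3), ToyModel("b", 7), ToyModel("c", 5)]
small = list(filter(MaxEdges(5), models))
```

Because the predicate is an ordinary callable, it composes with list comprehensions, sorted and other built-ins without any special container type.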
Other changes:
- plot_summary is now known as plot that lives on a Model.
- The output of sympify has changed:
  - Categorical input features are now referenced as <feature_name>_cat.
  - Numerical input names get suffixed with _in.
  - Underscores and spaces in input names are truncated.
- sympify now supports include_weights=False which gives an equation without weights and bias.
- sympify now gives consistent results on up to 15 significant digits.
- plot_goodness_of_fit is now called plot_regression.
- plot_regression_metrics has been removed - its uses should be covered by plot_regression.
Added
- plot_roc_curve has a threshold parameter that will plot on the false positive rate and true positive rate at the given threshold.
- plot_confusion_matrix has a threshold parameter.
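The point that a threshold marks on a ROC curve is the pair of false positive rate and true positive rate obtained when scores at or above the threshold count as positive. A minimal sketch, assuming that ">=" convention and a hypothetical helper name:

```python
def rates_at_threshold(y_true, y_score, threshold=0.5):
    """Return (false positive rate, true positive rate) at the threshold."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, y_score):
        predicted = score >= threshold  # assumption: ties count as positive
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    return fpr, tpr


fpr, tpr = rates_at_threshold([0, 0, 1, 1], [0.1, 0.6, 0.4, 0.9], threshold=0.5)
```

The same four counts (tp, fp, tn, fn) are the cells of a confusion matrix at that threshold.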
Removed
- Our experimental feature module __future__ has for now outlived its purpose and has been removed.
[1.6.1] - 2021-05-04
[1.6.0] - 2021-05-04
Added
- (future) Graph Recorder object that incorporates barcode and feature frequency matrix and heatmaps (now called feature occurrence)
Fixed
- Fixed a warning from matplotlib about duplicate cmap registering for newer versions of mpl.
[1.5.6] - 2021-04-23
Fixed
- Fixed bug in ROC plot function.
[1.5.5] - 2021-04-21
Fixed
- Fixed bug in qlattice migration script.
[1.5.4] - 2021-04-21
Added
- New semantic type "bool" which will only use functions that make sense for boolean data types (add, multiply, gaussian2)
- New statistical plots, plot_goodness_of_fit and plot_residuals to feyn.plots.
- New transient QLattice feature.
- plot summary improvements:
  - Pearson correlation is now default - and now properly displays negative instead of absolute correlation
  - MI correlation now follows a simple linear colormap that better represents it.
  - Spearman correlation is now available as an alternative correlation function
- Introduce plot_flow and a Jupyter-widget-enabled plot_flow_interactive for graphs, allowing you to play around with samples and see the activations through the graph.
- (future) Barcode plot and feature frequency matrix and heatmap plots.
- Query language available through feyn.filters updated
  - New matching starts from the output of the graph.
  - Can write queries using + and * operators.
  - Wildcards matching any subgraph _, complexity can be constrained with edge count in brackets.
- New show_threshold parameter in plot_roc_curve that colours the ROC curve by thresholds.
Changed
- Backwards compatibility for QLattice URLs has been removed from feyn.QLattice(). Now the only accepted usages are:
  - feyn.QLattice(qlattice="<qlattice-id>", api_token="<token>").
  - feyn.QLattice(config="<name of a section in your config file>").
  - feyn.QLattice() # First section in your config file.
[1.5.3] - 2021-03-26
Changed
- General algorithm improvements.
Added
- Support for up to 2000 registers (Input features). Previously 200.
- Made it easier to trigger the hover information on graphs in notebooks.
[1.5.2] - 2021-03-11
Fixed
- Fixed bug in random seed, which caused QLattice.reset() to always use the same seed.
[1.5.1] - 2021-03-10
Changed
- Improve deprecation warning, so that it is obvious how to migrate the old configuration file.
[1.5.0] - 2021-03-10
Changed
- The parameters for the QLattice initializer have changed. You now only have to specify the qlattice and the token instead of the full url.
- With this, the configuration-file format has also changed accordingly. url has been replaced by server and qlattice. The old format still works, but support for it will be removed in a future release. A compatibility warning will be displayed for now.
Fixed
- plot_partial2d: Fixed to use new contract when getting graph state.
[1.4.8] - 2021-02-26
Changed
- The QGraph.fit supports using Akaike Information Criterion (AIC) or Bayesian information criterion (BIC). This may become the default in the future, reducing the need for limiting depth and edges manually
- Default threads used for fitting and sorting changed from 1 to 4
[1.4.7] - 2021-02-12
Changed
- General performance improvements in finding good graphs.
Added
- Add roc_auc_score to feyn.metrics which will calculate the AUC of a graph. (Also accessible on graph.roc_auc_score).
- Plotting style improvements:
  - 'light' is now usable as alias to 'default' when setting theme.
  - Matplotlib plot styling now matches the theme choice.
  - Added colormaps to use with matplotlib: 'feyn', 'feyn-diverging', 'feyn-partial', 'feyn-primary', 'feyn-secondary', 'feyn-highlight', 'feyn-accent'.
- Added FeatureImportanceTable to feyn.insights. (The equivalent functionality was previously in pdheatmap from __future__).
- (future) Add various stats functions: graph_f_score and plot_graph_p_value.
[1.4.6] - 2021-01-08
Added
- Targeted Maximum Likelihood Estimation (TMLE) introduced in feyn.inference. See more in our docs
Changed
- Default graphs sort back to loss_value instead of bic.
- feyn.tools.simpify_graph default option is now to not formulate the logistic function, but instead output "logreg(…)". Use argument symbolic_lr=True if you want to keep previous behavior.
- Categorical variables rendered in the sympify function from category(<X_featurename>) to category_.
[1.4.5] - 2020-12-18
Added
- Python 3.9 support
Fixed
- Fix memory bug when handling many registers (>165) in QLattice.
[1.4.4] - 2020-12-18
Removed
- metrics.get_mutual_information(), metrics.get_pearson_correlations(), metrics.get_summary_information(). The functionality is now covered by metrics.calculate_mi(), metrics.calculate_pc() in the public API.
Changed
- Even more general performance improvements.
[1.4.3] - 2020-12-04
Changed
- General performance improvements.
Added
- Graph.plot_partial() and Graph.plot_partial2d() to analyze the graph response.
- metrics.calculate_mi(), metrics.calculate_pc() to calculate mutual information and pearson correlations.
[1.4.2] - 2020-10-26
- Graph.sympify() which returns a sympy expression matching the graph
- Mutual information and pearson correlations are now calculated on entire data set, giving more accurate results
- Graph.fit() function which can be used to fit or refit a single graph on a dataset
- Adding support to both numerical and categorical partial dependence plots
- Bugfix: 1d plots with categoricals ordered wrt their weights
- Bugfix: Fix support np-dict for graph_summary
[1.4.1] - 2020-10-09
- Added linear and constant reference models (in feyn.reference) to compare with and calculate p-values (lives in feyn.metrics).
- Graph visualizations rewritten and much improved.
- Dark theme support!
[1.4.0] - 2020-09-03
- ql.update now accepts either a single graph or a list of graphs.
- Added methods: QLattice.get_regressor and QLattice.get_classifier to replace QLattice.get_qgraph.
- New mathematical functions: add, exp and log.
- You can now control functions in graphs with new filters: feyn.filters.Functions(["add", "multiply"]) and feyn.filters.ExcludeFunctions("sine").
- New plot. ROC-curves.
[1.3.3] - 2020-08-14
- Shorthands for plotting and score utility functions on feyn.Graph
- New approach to damping learning rates leads to more accurate fits
- Max-depth filter is less strict on which type of integers it accepts.
- Add automatic retries on failed http-requests
- Configurations can now also be stored in <home_folder>/.config/.feynrc
[1.3.1] - 2020-07-07
- The new automatic scaler is now default on both input and output.
- Alternative input and output semantic types (f#) that do no scaling
[1.3.0] - 2020-07-06
- Added a new scaler: f$. It is more automatic.
[1.2.1] - 2020-06-16
- Changed the configuration environment variable QLATTICE_BASE_URI to FEYN_QLATTICE_URL.
- Changed the configuration environment variable FEYN_TOKEN to FEYN_QLATTICE_API_TOKEN.
- Support for configuration via config file. feyn.ini or .feynrc located in your home folder.
- Breaks compatibility with qlattice <= 1.1.2
- Removed the need to add registers via qlattice.registers.get (and removed qlattice.registers.get)
- New parameter to get_qgraph function to choose the semantic type of the data columns (this replaces the need for cat/fixed register types)
- Fixes bug with numpy 1.15 and multiarray import in windows 64bit
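For scripts that still export the old environment variable names, a small shim can copy the values over to the new names listed above. The helper migrate_env is hypothetical and not part of the library. It works on a plain dict so it can be tried without touching the real environment.

```python
# Old name -> new name, as listed in the entries above.
RENAMED = {
    "QLATTICE_BASE_URI": "FEYN_QLATTICE_URL",
    "FEYN_TOKEN": "FEYN_QLATTICE_API_TOKEN",
}


def migrate_env(env):
    """Return a copy of env with values copied from old names to new ones."""
    updated = dict(env)
    for old, new in RENAMED.items():
        # Never overwrite a value that is already set under the new name.
        if old in updated and new not in updated:
            updated[new] = updated[old]
    return updated


migrated = migrate_env({"FEYN_TOKEN": "<token>"})
```

Passing dict(os.environ) to the helper would apply the same mapping to the current process environment.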
[1.1.2] - 2020-05-11
- Added Windows Support!
- Removed dependency to GraphViz
- Removed dependency to scikit-learn