Split datasets into random subsets
This function is used to split a dataset into random subsets - typically training and test data.
The input dataset should be either a pandas DataFrame or a dictionary of numpy arrays. The ratio parameter controls both how many subsets the data is split into and their relative sizes.
Example: Split data in the ratio 2:1 into train and test data
>>> train, test = feyn.tools.split(data, [2,1])
Example: Split data into training, validation and holdout data, with 80% for training and 10% each for validation and holdout
>>> train, validation, holdout = feyn.tools.split(data, [.8, .1, .1])
Arguments:
data -- The data to split (DataFrame or dict of numpy arrays).
ratio -- The relative sizes of the resulting subsets, as a list with one entry per subset (for example [2,1] or [.8, .1, .1]).
Returns:
list of subsets -- Subsets of the dataset (same type as the input dataset).
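Sketch: The following illustrates how a ratio-based random split of this kind could work. It is not the implementation of feyn.tools.split; the function name split_by_ratio and the seed parameter are hypothetical, and the DataFrame check is a duck-typed assumption made so the sketch needs only numpy.

```python
import numpy as np


def split_by_ratio(data, ratio, seed=None):
    """Hypothetical stand-in for a ratio-based random split (not feyn's code)."""
    # Assumption: anything exposing .iloc is treated as a pandas DataFrame,
    # everything else as a dict of equally long numpy arrays.
    is_frame = hasattr(data, "iloc")
    n_rows = len(data) if is_frame else len(next(iter(data.values())))

    # Normalise the ratio so that [2, 1] and [2/3, 1/3] mean the same thing.
    weights = np.asarray(ratio, dtype=float)
    fractions = weights / weights.sum()

    # Shuffle the row positions, then cut them at the cumulative fractions.
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    cuts = np.round(np.cumsum(fractions)[:-1] * n_rows).astype(int)
    parts = np.split(order, cuts)

    # Return subsets of the same type as the input.
    if is_frame:
        return [data.iloc[idx] for idx in parts]
    return [{key: values[idx] for key, values in data.items()} for idx in parts]


# Usage: nine rows split 2:1 into six training rows and three test rows.
data = {"x": np.arange(9), "y": np.arange(9) * 10}
train, test = split_by_ratio(data, [2, 1], seed=0)
```

Because whole rows are selected by position, the columns stay aligned within each subset, and every input row ends up in exactly one subset.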