Example 2: Understanding the hyper-parameter optimization
Intuitive explanation of hyper-parameters
Hyper-parameters are parameters of a classifier (monoview or multiview) that are task-dependent and play a major part in the performance of the algorithm for a given task.
The simplest example is the decision tree. One of its hyper-parameters is the depth of the tree. The deeper the tree, the more closely it will fit the learning data. However, a tree that is too deep will most likely overfit and will have little value on unseen testing data.
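To make this concrete, here is a minimal sketch (using scikit-learn on synthetic data, both assumptions for the illustration and not the platform's own code) that trains trees of increasing depth: training accuracy keeps rising with depth, while test accuracy eventually stalls or drops, which is the overfitting effect described above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification problem (assumption for the demo).
X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for depth in (1, 3, 10, None):  # None lets the tree grow until the leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    tree.fit(X_train, y_train)
    print(f"max_depth={depth}: "
          f"train={tree.score(X_train, y_train):.2f}, "
          f"test={tree.score(X_test, y_test):.2f}")
```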
This platform proposes a randomized search to optimize the hyper-parameters for a given task. In this example, we will first analyze how it works, and then how to use it.
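As an illustration of the idea, and not of the platform's actual implementation, the sketch below uses scikit-learn's RandomizedSearchCV to draw random hyper-parameter configurations and keep the best-scoring one; the search space and the number of draws are arbitrary choices for the example.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=42)

# Draw 10 random values of max_depth and keep the best one,
# each configuration being scored by 5-fold cross-validation.
search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_distributions={"max_depth": randint(1, 20)},
    n_iter=10,
    cv=5,
    random_state=42,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 2))
```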
Understanding train/test split
In order to provide robust results, this platform splits the dataset into a training set, which the classifiers use to optimize their hyper-parameters and learn a relevant model, and a testing set, which takes no part in the learning process and serves as unseen data to estimate each model's generalization capacity.
This split is controlled by the config file's split argument, a float that gives the ratio between the size of the testing set and the size of the training set:

\[ \text{split} = \frac{\text{test size}}{\text{train size}} \]

In order to be as fair as possible, this split is stratified: the proportion of each class is kept the same in the training set and in the testing set.
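The sketch below reproduces this behaviour with scikit-learn's train_test_split (an assumption for the illustration, not the platform's code): stratify=y keeps the class proportions identical in both sets. Note that scikit-learn's test_size expects the test fraction of the whole dataset, so the config-style ratio is converted first.

```python
from collections import Counter
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Imbalanced synthetic dataset (assumption for the demo).
X, y = make_classification(n_samples=1000, weights=[0.7, 0.3], random_state=42)

split = 0.25                          # config-style ratio: test size / train size
test_fraction = split / (1 + split)   # sklearn expects test size / total size

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=test_fraction, stratify=y, random_state=42)

print(Counter(y_train))  # roughly 70/30 class balance preserved
print(Counter(y_test))   # same class proportions as the training set
```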