@@ -13,7 +13,7 @@ To settle this issue, the platform can run on multiple splits and return the mea
How to use it
How to use it
-------------
-------------
This feature is controlled by a single argument : ``stats_iter:`` in the ``Classification`` section of the config file.
This feature is controlled by a single argument : ``stats_iter:`` in the config file.
Setting ``stats_iter`` to a value greater than one slightly modifies the result directory's structure.
Setting ``stats_iter`` to a value greater than one slightly modifies the result directory's structure.
Indeed, as the platform will perform a benchmark on multiple train/test splits, the result directory will be larger in order to keep all the individual results.
Indeed, as the platform will perform a benchmark on multiple train/test splits, the result directory will be larger in order to keep all the individual results.
In terms of pseudo-code, if one uses HPO, it adds a for loop around the pseudo-code displayed in Example 2 ::
In terms of pseudo-code, if one uses HPO, it adds a for loop around the pseudo-code displayed in Example 2 ::
...
@@ -49,15 +49,23 @@ The result directory will be structured as :
| | └── train_indices.csv
| | └── train_indices.csv
| | ├── 1560_12_25-15_42-*-LOG.log
| | ├── 1560_12_25-15_42-*-LOG.log
| | ├── config_file.yml
| | ├── config_file.yml
| | ├── *-accuracy_score.png
| | ├── *-accuracy_score.
| | ├── *-accuracy_score-class.html
| | ├── *-accuracy_score.html
| | ├── *-accuracy_score.csv
| | ├── *-accuracy_score.csv
| | ├── *-f1_score.png
| | ├── *-f1_score.png
| | ├── *-f1_score.csv
| | ├── *-f1_score.csv
| | ├── *-f1_score-class.html
| | ├── *-f1_score.html
| | ├── *-error_analysis_2D.png
| | ├── *-error_analysis_2D.png
| | ├── *-error_analysis_2D.html
| | ├── *-error_analysis_2D.html
| | ├── *-error_analysis_bar.png
| | ├── *-error_analysis_bar.png
| | ├── *-error_analysis_bar.HTML
| | ├── *-bar_plot_data.csv
| | ├── *-bar_plot_data.csv
| | ├── *-2D_plot_data.csv
| | ├── *-2D_plot_data.csv
| | ├── feature_importances
| | ├── [..
| | ├── ..]
| | ├── adaboost
| | ├── adaboost
| | | ├── ViewNumber0
| | | ├── ViewNumber0
| | | | ├── *-summary.txt
| | | | ├── *-summary.txt
...
@@ -65,7 +73,7 @@ The result directory will be structured as :
| | | ├── ViewNumber1
| | | ├── ViewNumber1
| | | | ├── *-summary.txt
| | | | ├── *-summary.txt
| | | | ├── <other classifier-dependent files>
| | | | ├── <other classifier-dependent files>
| | | | ├── ViewNumber2
| | | ├── ViewNumber2
| | | | ├── *-summary.txt
| | | | ├── *-summary.txt
| | | | ├── <other classifier-dependent files>
| | | | ├── <other classifier-dependent files>
| | ├── decision_tree
| | ├── decision_tree
...
@@ -92,11 +100,16 @@ The result directory will be structured as :
| ├── config_file.yml
| ├── config_file.yml
| ├── *-accuracy_score.png
| ├── *-accuracy_score.png
| ├── *-accuracy_score.csv
| ├── *-accuracy_score.csv
| ├── *-accuracy_score.html
| ├── *-accuracy_score-class.html
| ├── *-f1_score.png
| ├── *-f1_score.png
| ├── *-f1_score.csv
| ├── *-f1_score.csv
| ├── *-f1_score.html
| ├── *-f1_score-class.html
| ├── *-error_analysis_2D.png
| ├── *-error_analysis_2D.png
| ├── *-error_analysis_2D.html
| ├── *-error_analysis_2D.html
| ├── *-error_analysis_bar.png
| ├── *-error_analysis_bar.png
| ├── *-error_analysis_bar.html
| ├── *-bar_plot_data.csv
| ├── *-bar_plot_data.csv
| ├── *-2D_plot_data.csv
| ├── *-2D_plot_data.csv
| ├── feature_importances
| ├── feature_importances
...
@@ -112,8 +125,8 @@ If you look closely, nearly all the files from Example 1 are in each ``iter_`` d
So, the files stored in ``started_1560_12_25-15_42/`` are the ones that show the mean results over all the statistical iterations.
So, the files stored in ``started_1560_12_25-15_42/`` are the ones that show the mean results over all the statistical iterations.
For example, ``started_1560_12_25-15_42/*-accuracy_score.png`` looks like :
For example, ``started_1560_12_25-15_42/*-accuracy_score.png`` looks like :
.. figure:: ./images/accuracy_mean.png
.. raw:: html
:scale: 25
./images/accuracy_mean.html
The main difference between this plot and the one from Example 1 is that here, the scores are means over all the statistical iterations, and the standard deviations are plotted as vertical lines on top of the bars and printed after each score under the bars as "± <std>".
The main difference between this plot and the one from Example 1 is that here, the scores are means over all the statistical iterations, and the standard deviations are plotted as vertical lines on top of the bars and printed after each score under the bars as "± <std>".
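
As a rough illustration of how such a "mean ± std" label could be obtained, here is a minimal sketch. It is not the platform's own code: the function names and the scores are hypothetical, and it assumes the population standard deviation, which may differ from what the platform actually computes.

.. code-block:: python

    import statistics


    def summarize_scores(scores_per_iteration):
        """Return (mean, std) of one classifier's scores over the statistical iterations.

        Assumption: the population standard deviation is used.
        """
        mean = statistics.fmean(scores_per_iteration)
        std = statistics.pstdev(scores_per_iteration)
        return mean, std


    def format_label(mean, std):
        """Build the text printed under a bar, in the form "<mean> ± <std>"."""
        return f"{mean:.2f} ± {std:.2f}"


    # Hypothetical accuracy scores of one classifier, one per statistical iteration.
    iteration_scores = [0.80, 0.90, 0.70, 0.80]
    mean, std = summarize_scores(iteration_scores)
    label = format_label(mean, std)
    print(label)

With more statistical iterations the mean becomes a more stable estimate, and the standard deviation indicates how much the score depends on the train/test split.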
...
@@ -121,9 +134,13 @@ Then, each iteration's directory regroups all the results, structured as in Exam
**Example with stats iter**
Example
<<<<<<<
**Duration ??**
Duration
<<<<<<<<
Increasing the number of statistical iterations can be costly in terms of computational resources