Merge commit 'c66ba838'

Commit 456e40ef authored Apr 29, 2024 by Paul Best
--- a/README.md
+++ b/README.md
@@ -10,11 +10,12 @@ Convenient to iterate over the whole dataset
 ```
 from metadata import species
 for specie in species:
-    wavpath, FS, nfft = species[specie].values()
+    wavpath, FS, nfft, downsample, step = species[specie].values()
    # iterate over files (one per vocalisation)
    for fn in tqdm(glob(wavpath), desc=specie):
        sig, fs = sf.read(fn) # read soundfile
        annot = pd.read_csv(f'{fn[:-4]}.csv') # read annotations (one column Time in seconds, one column Freq in Hertz)
+        preds = pd.read_csv(f'{fn[:-4]}_preds.csv') # read the file gathering per algorithm f0 predictions
 ```

 ### print_annot.py
@@ -31,7 +32,7 @@ Runs all baseline algorithms over the dataset.
 - [x] crepe (original tensorflow implem https://arxiv.org/abs/1802.06182)
 - [x] basic pitch (https://arxiv.org/abs/2203.09893)
 - [x] pesto (https://arxiv.org/abs/2309.02265)
- [~] pesto finetuned over the target species
+- [x] pesto finetuned over the target species

 This script stores predictions along with resampled annotations in `{basename}_preds.csv` files

@@ -47,21 +48,26 @@ Scores are stored in `scores/{specie}_scores.csv` files
 ### compute_salience_SHR.py
 Evaluates metrics for each annotated temporal bin:
 - the presence of a sub-harmonic following [this paper](https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=cb8f47c23c74932152456a6f7a464fd3a2321259)
- the saliency of the annotation as the ratio of the energy of the f0 (one tone around the annotation) and its surrounding (one octave around the annotation)
-These values are stored in the SHR and salience columns of the {basename}_preds.csv files
+- the salience of the annotation as the ratio of the energy of the f0 (one tone around the annotation) and its surrounding (one octave around the annotation)
+- the harmonicity of the annotation as the ratio between the energy of all harmonics and that of all harmonics except the fundamental 
+These values are stored in the `SHR`, `harmonicity` and `salience` columns of the `{basename}_preds.csv` files
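
As an illustration of the salience ratio described above, here is a minimal sketch in plain Python. It is not the code of `compute_salience_SHR.py`. It assumes a precomputed power spectrum for one time bin, reads "one tone around the annotation" as plus or minus one semitone and "one octave around the annotation" as plus or minus half an octave, and assumes the surrounding band includes the f0 band so that the ratio lies between 0 and 1. The function names `band_energy` and `salience` are hypothetical.

```python
def band_energy(freqs, power, f_lo, f_hi):
    """Sum the spectral power of all bins whose frequency lies in [f_lo, f_hi]."""
    return sum(p for f, p in zip(freqs, power) if f_lo <= f <= f_hi)


def salience(freqs, power, f0):
    """Energy in a one-tone band around f0 divided by energy in a one-octave band.

    Assumption: both bands are centred on f0 on a log-frequency scale, and the
    octave band includes the tone band.
    """
    tone = band_energy(freqs, power, f0 * 2 ** (-1 / 12), f0 * 2 ** (1 / 12))
    octave = band_energy(freqs, power, f0 * 2 ** -0.5, f0 * 2 ** 0.5)
    return tone / octave if octave > 0 else 0.0


# Toy spectra with 10 Hz resolution: a single peak at 1 kHz, and flat noise
freqs = [i * 10.0 for i in range(200)]
peak = [1.0 if f == 1000.0 else 0.0 for f in freqs]
flat = [1.0] * len(freqs)
print(salience(freqs, peak, 1000.0), salience(freqs, flat, 1000.0))
```

With this definition a clean tonal frame scores close to 1, while broadband noise scores close to the ratio of the two bandwidths.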

 ### get_noisy_pngs.py
-Thresholds saliency and SHR values vocalisation wise and copies those considered "noisy" into the `noisy_pngs` folder to browse and check results. One can then check if the spectrogram exists in `noisy_pngs` to discard noisy labels.
+Thresholds salience and SHR values per vocalisation and copies the spectrograms of those considered "noisy" into the `noisy_pngs` folder for browsing and checking. A vocalisation's labels can then be discarded if its spectrogram exists in `noisy_pngs`.

 ### train_crepe.py
 Fine tunes the crepe model using the whole dataset.
 - [x] Loads 1024 sample windows and their corresponding f0 to be stored in a large `train_set.pkl` file (skip if data hasn't changed).
 - [x] Applies gradient descent using the BCE following the crepe paper (this task is treated as a binary classification for each spectral bin).
 - [x] The fine tuned model is stored in `crepe_ft/model_all.pth`
- [x] Train on one target species given as argument (weights are stored in `crepe_ft/model_only_{specie}.pth)
- [x] Train on all species except the target given as argument (weights are stored in `crepe_ft/model_omit_{specie}.pth)
-
-
+- [x] Train on one target species given as argument (weights are stored in `crepe_ft/model_only_{specie}.pth`)
+- [x] Train on all species except the target given as argument (weights are stored in `crepe_ft/model_omit_{specie}.pth`)
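
To illustrate the BCE objective mentioned in the list above, here is a minimal sketch of the training target described in the crepe paper: 360 pitch bins spaced 20 cents apart starting at C1 (32.70 Hz), a target vector that is Gaussian-blurred around the true f0 with a standard deviation of 25 cents, and a binary cross-entropy averaged over bins. This is plain Python for readability and is not the code of `train_crepe.py`. The bin layout used for fine tuning in this repository may differ, and the names `gaussian_target` and `bce` are hypothetical.

```python
import math

N_BINS = 360          # number of pitch bins (crepe paper)
CENTS_PER_BIN = 20.0  # spacing between bin centres, in cents
F_REF = 10.0          # reference frequency of the cent scale, in Hz
F_MIN = 32.70         # centre of the lowest bin (C1), in Hz
SIGMA_CENTS = 25.0    # standard deviation of the Gaussian blur, in cents


def hz_to_cents(f_hz):
    """Convert a frequency in Hz to cents relative to F_REF."""
    return 1200.0 * math.log2(f_hz / F_REF)


# Centre of each pitch bin, in cents
BIN_CENTS = [hz_to_cents(F_MIN) + CENTS_PER_BIN * i for i in range(N_BINS)]


def gaussian_target(f0_hz):
    """Soft target: one value in [0, 1] per bin, peaking at the bin nearest f0."""
    c = hz_to_cents(f0_hz)
    return [math.exp(-((b - c) ** 2) / (2 * SIGMA_CENTS ** 2)) for b in BIN_CENTS]


def bce(target, pred, eps=1e-7):
    """Binary cross-entropy averaged over bins (each bin is its own binary task)."""
    total = 0.0
    for y, p in zip(target, pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(target)


target = gaussian_target(440.0)
uniform = [0.5] * N_BINS
print(bce(target, target), bce(target, uniform))
```

A prediction that matches the blurred target gets a much lower loss than an uninformative prediction of 0.5 in every bin.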
+
+### Plotting
+These scripts generate plots to visualise results (saved as `.pdf` files in the `figures` folder):
+- `plot_freq_distrib.py` generates a three-panel figure with violin plots showing the distributions of f0 annotations (in Hz), number of voiced bins per vocalisation, and modulation rate (in Hz/s)
+- `plot_snr_distrib.py` generates a three-panel figure with violin plots showing the distributions of salience, SHR and harmonicity (see `compute_salience_SHR.py`)
+- `plot_scores_bars.py` generates a four-panel figure showing performance for each species/algorithm combination. Arguments select whether the plot covers all vocalisations, skips vocalisations with a low salience / high SHR, or skips time bins with a low salience / high SHR.