diff --git a/README.md b/README.md
index a8290641562cb6d5973441cab97d93f70c9841ab..c819dca7c1a70fbb1874154c7c7ea60612227912 100644
--- a/README.md
+++ b/README.md
@@ -10,11 +10,12 @@ Convenient to iterate over the whole dataset
 ```
+from glob import glob
+import pandas as pd
+import soundfile as sf
+from tqdm import tqdm
 from metadata import species
 for specie in species:
-    wavpath, FS, nfft = species[specie].values()
+    wavpath, FS, nfft, downsample, step = species[specie].values()
     # iterate over files (one per vocalisation)
     for fn in tqdm(glob(wavpath), desc=specie):
         sig, fs = sf.read(fn) # read soundfile
         annot = pd.read_csv(f'{fn[:-4]}.csv') # read annotations (one column Time in seconds, one column Freq in Hertz)
+        preds = pd.read_csv(f'{fn[:-4]}_preds.csv') # read the file gathering per-algorithm f0 predictions
 ```
 
 ### print_annot.py
@@ -31,7 +32,7 @@ Runs all baseline algorithms over the dataset.
 - [x] crepe (original tensorflow implementation, https://arxiv.org/abs/1802.06182)
 - [x] basic pitch (https://arxiv.org/abs/2203.09893)
 - [x] pesto (https://arxiv.org/abs/2309.02265)
-- [~] pesto finetuned over the target species
+- [x] pesto finetuned over the target species
 
 This script stores predictions along with resampled annotations in `{basename}_preds.csv` files
 
@@ -47,11 +48,12 @@ Scores are stored in `scores/{specie}_scores.csv` files
 ### compute_salience_SHR.py
 Evaluates metrics for each annotated temporal bin:
 - the presence of a sub-harmonic following [this paper](https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=cb8f47c23c74932152456a6f7a464fd3a2321259)
-- the saliency of the annotation as the ratio of the energy of the f0 (one tone around the annotation) and its surrounding (one octave around the annotation)
-These values are stored in the SHR and salience columns of the {basename}_preds.csv files
+- the salience of the annotation as the ratio between the energy of the f0 (one tone around the annotation) and that of its surroundings (one octave around the annotation)
+- the harmonicity of the annotation as the ratio between the energy of all harmonics and that of all harmonics except the fundamental
+These values are stored in the SHR, harmonicity and salience columns of the `{basename}_preds.csv` files (see the illustrative sketch at the end of this README)
 
 ### get_noisy_pngs.py
-Thresholds saliency and SHR values vocalisation wise and copies those considered "noisy" into the `noisy_pngs` folder to browse and check results. One can then check if the spectrogram exists in `noisy_pngs` to discard noisy labels.
+Thresholds salience and SHR values vocalisation-wise and copies those considered "noisy" into the `noisy_pngs` folder to browse and check results. One can then check if the spectrogram exists in `noisy_pngs` to discard noisy labels.
 
 ### train_crepe.py
 Fine-tunes the crepe model using the whole dataset.
@@ -61,7 +63,11 @@ Fine-tunes the crepe model using the whole dataset.
 - [x] Train on one target species given as argument (weights are stored in `crepe_ft/model_only_{specie}.pth`)
 - [x] Train on all species except the target given as argument (weights are stored in `crepe_ft/model_omit_{specie}.pth`)
-
+### Plotting
+The following scripts generate plots to visualise results (saved as `.pdf` files in the `figures` folder):
+- `plot_freq_distrib.py` generates a three-panel figure with violin plots showing the distributions of f0 annotations (in Hz), the number of voiced bins per vocalisation, and the modulation rate (in Hz/sec)
+- `plot_snr_distrib.py` generates a three-panel figure with violin plots showing the distributions of salience, SHR and harmonicity (see `compute_salience_SHR.py`)
+- `plot_scores_bars.py` generates a four-panel figure showing performance for each species/algorithm combination. Arguments can be set to generate this plot over all vocalisations, skipping vocalisations with a low salience / high SHR, or skipping time bins with a low salience / high SHR.
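+
+### Illustrative sketch
+A minimal sketch of the salience and harmonicity ratios described under `compute_salience_SHR.py`, assuming a magnitude spectrum `spec` over bin frequencies `freqs` (in Hz) for one annotated time bin with f0 annotation `f0` (in Hz). The band widths (one tone read as +/- a semitone, one octave as +/- half an octave) and all names here are illustrative assumptions, not the exact implementation:
+```
+import numpy as np
+
+def band_energy(spec, freqs, lo, hi):
+    # sum of squared magnitudes over the frequency band [lo, hi]
+    mask = (freqs >= lo) & (freqs <= hi)
+    return (spec[mask] ** 2).sum()
+
+def salience(spec, freqs, f0):
+    # energy of the f0: one tone around the annotation
+    tone = band_energy(spec, freqs, f0 * 2 ** (-1 / 12), f0 * 2 ** (1 / 12))
+    # energy of its surroundings: one octave around the annotation
+    octave = band_energy(spec, freqs, f0 * 2 ** -0.5, f0 * 2 ** 0.5)
+    return tone / octave
+
+def harmonicity(spec, freqs, f0, n_harmonics=10):
+    # energy around each harmonic k * f0 (n_harmonics is an assumption)
+    energies = [band_energy(spec, freqs, k * f0 * 2 ** (-1 / 12), k * f0 * 2 ** (1 / 12))
+                for k in range(1, n_harmonics + 1)]
+    # ratio between the energy of all harmonics and that of all harmonics except the fundamental
+    return sum(energies) / sum(energies[1:])
+```
+Suitable inputs can be obtained with e.g. `spec = np.abs(np.fft.rfft(frame))` and `freqs = np.fft.rfftfreq(len(frame), 1 / fs)` for a windowed frame `frame` of a signal sampled at `fs` Hz.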