Commit ac8aa2bc authored by Paul Best

Update README.md

parent b32cd08d
@@ -10,11 +10,12 @@ Convenient to iterate over the whole dataset
```
from metadata import species
for specie in species:
    wavpath, FS, nfft = species[specie].values()
    wavpath, FS, nfft, downsample, step = species[specie].values()
    # iterate over files (one per vocalisation)
    for fn in tqdm(glob(wavpath), desc=specie):
        sig, fs = sf.read(fn) # read soundfile
        annot = pd.read_csv(f'{fn[:-4]}.csv') # read annotations (one column Time in seconds, one column Freq in Hertz)
        preds = pd.read_csv(f'{fn[:-4]}_preds.csv') # read the file gathering per-algorithm f0 predictions
```
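The unpacking above relies on each entry of `species` being a dict whose values come out in a fixed order. A minimal sketch of what such an entry might look like follows; the species name, the key names and every value are hypothetical placeholders (only the unpacked variable names come from the snippet above), and the real entries live in `metadata.py`.

```python
# Hypothetical entry: key names and values are assumptions for illustration only.
species = {
    "example_species": {
        "wavpath": "data/example_species/*.wav",  # glob pattern for the recordings
        "FS": 44100,        # placeholder value
        "nfft": 1024,       # placeholder value
        "downsample": 1,    # placeholder value
        "step": 1,          # placeholder value
    },
}

for specie in species:
    # dicts preserve insertion order, so .values() yields the fields
    # in the order they were declared above
    wavpath, FS, nfft, downsample, step = species[specie].values()
```

Because the unpacking is positional, adding or reordering a key in an entry changes what each variable receives, so every species entry needs the same fields in the same order.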
### print_annot.py
@@ -31,7 +32,7 @@ Runs all baseline algorithms over the dataset.
- [x] crepe (original TensorFlow implementation, https://arxiv.org/abs/1802.06182)
- [x] basic pitch (https://arxiv.org/abs/2203.09893)
- [x] pesto (https://arxiv.org/abs/2309.02265)
- [~] pesto finetuned over the target species
- [x] pesto finetuned over the target species
This script stores predictions along with resampled annotations in `{basename}_preds.csv` files
@@ -47,11 +48,12 @@ Scores are stored in `scores/{specie}_scores.csv` files
### compute_salience_SHR.py
Evaluates metrics for each annotated temporal bin:
- the presence of a sub-harmonic following [this paper](https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=cb8f47c23c74932152456a6f7a464fd3a2321259)
- the saliency of the annotation as the ratio of the energy of the f0 (one tone around the annotation) and its surrounding (one octave around the annotation)
These values are stored in the SHR and salience columns of the {basename}_preds.csv files
- the salience of the annotation as the ratio of the energy of the f0 (one tone around the annotation) and its surrounding (one octave around the annotation)
- the harmonicity of the annotation as the ratio between the energy of all harmonics and that of all harmonics except the fundamental
These values are stored in the SHR, harmonicity and salience columns of the {basename}_preds.csv files
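As an illustration of the salience measure described above, here is a minimal sketch for a single time bin. It is not the code of `compute_salience_SHR.py`: the function names are hypothetical, the "one tone" band is assumed to span one semitone on each side of the annotation, the "one octave" band half an octave on each side, and the octave band is assumed to include the tone band.

```python
def band_energy(freqs, power, f_lo, f_hi):
    """Sum the power of all spectrum bins whose frequency lies in [f_lo, f_hi]."""
    return sum(p for f, p in zip(freqs, power) if f_lo <= f <= f_hi)


def salience(freqs, power, f0):
    """Energy within one tone around f0 divided by energy within one octave around f0.

    Assumption: both bands are centred on f0 on a log-frequency scale.
    """
    tone = band_energy(freqs, power, f0 * 2 ** (-1 / 12), f0 * 2 ** (1 / 12))
    octave = band_energy(freqs, power, f0 * 2 ** -0.5, f0 * 2 ** 0.5)
    return tone / octave if octave > 0 else float("nan")


# Toy power spectrum: flat background with a single peak at 1000 Hz
freqs = [i * 10.0 for i in range(200)]  # 0 to 1990 Hz in 10 Hz bins
flat = [1.0] * 200
peaked = list(flat)
peaked[100] = 101.0

s_flat = salience(freqs, flat, 1000.0)
s_peak = salience(freqs, peaked, 1000.0)
```

With these band definitions the value lies between 0 and 1, and a clear peak at the annotated frequency gives a higher salience than a flat spectrum.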
### get_noisy_pngs.py
Thresholds saliency and SHR values vocalisation wise and copies those considered "noisy" into the `noisy_pngs` folder to browse and check results. One can then check if the spectrogram exists in `noisy_pngs` to discard noisy labels.
Thresholds salience and SHR values vocalisation-wise and copies the spectrograms of vocalisations considered "noisy" into the `noisy_pngs` folder, where results can be browsed and checked. One can then check whether a spectrogram exists in `noisy_pngs` to discard noisy labels.
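The existence check mentioned above could be sketched as follows. The helper names are hypothetical, and the naming convention (a spectrogram called `<basename>.png` for a recording `<basename>.wav`) is an assumption, not something the README states.

```python
from pathlib import Path


def is_noisy(wav_path, noisy_dir="noisy_pngs"):
    """Return True if a spectrogram for this recording was copied into noisy_dir.

    Assumption: the spectrogram shares the recording's basename, with a .png extension.
    """
    return (Path(noisy_dir) / (Path(wav_path).stem + ".png")).exists()


def keep_clean(wav_paths, noisy_dir="noisy_pngs"):
    """Filter out the recordings flagged as noisy."""
    return [p for p in wav_paths if not is_noisy(p, noisy_dir)]
```

A call such as `keep_clean(glob(wavpath))` would then return only the files whose labels were not flagged.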
### train_crepe.py
Fine tunes the crepe model using the whole dataset.
@@ -61,7 +63,11 @@ Fine tunes the crepe model using the whole dataset.
- [x] Train on one target species given as argument (weights are stored in `crepe_ft/model_only_{specie}.pth`)
- [x] Train on all species except the target given as argument (weights are stored in `crepe_ft/model_omit_{specie}.pth`)
### Plotting
These scripts generate plots to visualise results (saved as `.pdf` files in the `figures` folder):
- `plot_freq_distrib.py` generates a three-panel subplot with violins showing distributions of f0 annotations in Hz, number of voiced bins per vocalisation, and modulation rate in Hz/s
- `plot_snr_distrib.py` generates a three-panel subplot with violins showing distributions of salience, SHR and harmonicity (see `compute_salience_SHR.py`)
- `plot_scores_bars.py` generates a four-panel subplot showing performances for each species/algorithm combination. Arguments can be set to generate this plot over all vocalisations, skipping vocalisations with a low salience / high SHR, or skipping time bins with a low salience / high SHR.
......