
f0_estimation

Cross-species F0 estimation: dataset and study of baseline algorithms

Info on current collaborations: https://docs.google.com/document/d/179dD1d6lmWhQ9e2E1AUoJyLZ1d_5c-aMNwAPkbmIcBU

metadata.py

Stores a dictionary of datasets and their characteristics (SR, NFFT, path to the sound files, and a downsampling factor for ultra-/infra-sonic signals). Convenient for iterating over the whole dataset.

from glob import glob

import pandas as pd
import soundfile as sf
from tqdm import tqdm

from metadata import species

for specie in species:
    wavpath, FS, nfft = species[specie].values()
    # iterate over files (one per vocalisation)
    for fn in tqdm(glob(wavpath), desc=specie):
        sig, fs = sf.read(fn)  # read the sound file
        annot = pd.read_csv(f'{fn[:-4]}.csv')  # read annotations (one column Time in seconds, one column Freq in Hertz)
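
The exact entries live in metadata.py; a minimal sketch of the structure the loop above expects (the species name and values below are illustrative, not taken from the repository):

# hypothetical entry, mirroring the three values unpacked above; for
# ultra/infra-sonic species a downsampling factor also comes into play
species = {
    'example_species': {
        'wavpath': 'data/example_species/*.wav',  # one file per vocalisation
        'FS': 44100,                              # sample rate
        'nfft': 1024,                             # FFT size for spectrograms
    },
}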

print_annot.py

For each vocalisation, prints a spectrogram with the annotations overlaid, stored as a .png file in the annot_pngs folder.
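
A minimal sketch of that overlay, reusing the file layout from the snippet above (print_annot.py itself may differ):

import matplotlib.pyplot as plt
import pandas as pd
import soundfile as sf

sig, fs = sf.read('example.wav')
annot = pd.read_csv('example.csv')  # columns: Time (s), Freq (Hz)

# spectrogram with the annotated f0 contour drawn on top
plt.specgram(sig, NFFT=1024, Fs=fs, noverlap=512)
plt.plot(annot.Time, annot.Freq, 'r.', label='annotation')
plt.legend()
plt.savefig('annot_pngs/example.png')
plt.close()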

run_all.py

Runs all baseline algorithms over the dataset.

This script stores predictions along with resampled annotations in {basename}_preds.csv files.
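
A sketch of how one baseline can be run and the annotation resampled onto its time grid (pYIN via librosa stands in for the actual set of baselines; column names are assumptions):

import librosa
import numpy as np
import pandas as pd
import soundfile as sf

sig, fs = sf.read('example.wav')
annot = pd.read_csv('example.csv')  # columns: Time (s), Freq (Hz)

# run one baseline f0 estimator
f0, voiced, _ = librosa.pyin(sig, fmin=50, fmax=2000, sr=fs)
times = librosa.times_like(f0, sr=fs)

# interpolate the annotated f0 at the algorithm's time stamps
annot_f0 = np.interp(times, annot.Time, annot.Freq)
pd.DataFrame({'time': times, 'annot': annot_f0, 'pyin': f0}).to_csv('example_preds.csv', index=False)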

print_annot.py

For each vocalisation, prints a spectrogram with the annotations and predictions overlaid, stored as a .png file in the pred_pngs folder.

eval_all.py

Evaluates each algorithm over the dataset using the {basename}_preds.csv files, with a threshold of 50 cents for the accuracy metrics. For each algorithm and species, this outputs the ROC-optimal threshold, recall, false alarm rate, pitch accuracy, and chroma accuracy. /!\ These metrics are measured per vocalisation before being averaged. Scores are stored in scores/{specie}_scores.csv files.
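
A sketch of the per-vocalisation accuracy computation (column names are assumptions, not taken from eval_all.py):

import numpy as np
import pandas as pd

df = pd.read_csv('example_preds.csv')
voiced = df.annot > 0              # bins carrying an annotated f0
detected = voiced & (df.pred > 0)  # ... where the algorithm also detects a pitch

# deviation between prediction and annotation, in cents
cents = 1200 * np.log2(df.pred[detected] / df.annot[detected])

pitch_acc = np.mean(np.abs(cents) < 50)
# chroma accuracy forgives octave errors: fold deviations into [-600, 600) cents
chroma_acc = np.mean(np.abs((cents + 600) % 1200 - 600) < 50)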

compute_salience_SHR.py

Evaluates metrics for each annotated temporal bin:

  • the presence of a sub-harmonic, following this paper
  • the saliency of the annotation, computed as the ratio between the energy of the f0 (one tone around the annotation) and that of its surroundings (one octave around the annotation); see the sketch below

These values are stored in the SHR and salience columns of the {basename}_preds.csv files.
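
A simplified reading of that salience ratio for a single annotated bin (function and variable names are illustrative):

def salience(spec_col, freqs, f0):
    """spec_col: magnitude spectrum at one time bin (numpy array);
    freqs: frequency of each spectral bin in Hz (numpy array)."""
    tone = (freqs > f0 * 2 ** (-1/12)) & (freqs < f0 * 2 ** (1/12))  # one tone around f0
    octave = (freqs > f0 * 2 ** (-1/2)) & (freqs < f0 * 2 ** (1/2))  # one octave around f0
    return spec_col[tone].sum() / spec_col[octave].sum()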

get_noisy_pngs.py

Thresholds saliency and SHR values vocalisation-wise and copies the vocalisations considered "noisy" into the noisy_pngs folder, to browse and check the results. One can then check whether a spectrogram exists in noisy_pngs to discard noisy labels.
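
A sketch of the vocalisation-wise filter (the aggregation and threshold values here are illustrative):

import shutil
import pandas as pd

df = pd.read_csv('example_preds.csv')
# flag the vocalisation when the annotation is weak or sub-harmonics dominate
if df.salience.median() < 1 or df.SHR.median() > 0.5:
    shutil.copy('annot_pngs/example.png', 'noisy_pngs/example.png')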

train_crepe.py

Fine-tunes the CREPE model using the whole dataset.

  • Loads 1024-sample windows and their corresponding f0, stored in a large train_set.pkl file (this step is skipped if the data hasn't changed).
  • Applies gradient descent using the BCE loss, following the CREPE paper (the task is treated as a binary classification for each spectral bin); see the sketch after this list.
  • The fine-tuned model is stored in crepe_ft/model_all.pth.
  • Train on one target species given as argument (weights are stored in crepe_ft/model_only_{specie}.pth).
  • Train on all species except the target given as argument (weights are stored in crepe_ft/model_omit_{specie}.pth).
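
A sketch of the training objective, following the CREPE paper: the annotated f0 is mapped onto 360 pitch bins of 20 cents each, the target is a Gaussian blur (sigma = 25 cents) around the true bin, and a binary cross-entropy is applied per bin. The bin constants come from the CREPE codebase; the loop skeleton and names are illustrative.

import torch
import torch.nn.functional as F

# CREPE's 360 pitch-bin centres, in cents above a 10 Hz reference
cents_bins = torch.arange(360) * 20 + 1997.3794

def target_vector(f0_hz):
    # Gaussian-blurred one-hot target around the annotated pitch (sigma = 25 cents)
    true_cents = 1200 * torch.log2(torch.tensor(f0_hz, dtype=torch.float) / 10)
    return torch.exp(-(cents_bins - true_cents) ** 2 / (2 * 25 ** 2))

# illustrative training step, one logit per pitch bin:
# logits = model(windows)                                    # (batch, 360)
# loss = F.binary_cross_entropy_with_logits(logits, targets)
# loss.backward(); optimizer.step()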