PESTO: Pitch Estimation with Self-supervised Transposition-equivariant Objective
tl;dr: Fast and powerful pitch estimator based on machine learning
This code is the implementation of the PESTO paper, which has been accepted at ISMIR 2023.
Disclaimer: This repository contains minimal code and should be used for inference only.
If you want full implementation details or want to use PESTO for research purposes, take a look at this repository (coming soon!).
Installation
pip install pesto-pitch
That's it!
Dependencies
This repository is implemented in PyTorch and has the following additional dependencies:
- numpy for basic I/O operations
- torchaudio for audio loading
- matplotlib for exporting pitch predictions as images (optional)
Usage
Command-line interface
This package includes a CLI as well as pretrained models. To use it, type in a terminal:
pesto my_file.wav
or
python -m pesto my_file.wav
Output formats
The output format can be specified with option -e/--export_format.
By default, the predicted pitch is saved in a .csv file that looks like this:
time,frequency,confidence
0.00,185.616,0.907112
0.01,186.764,0.844488
0.02,188.356,0.798015
0.03,190.610,0.746729
0.04,192.952,0.771268
0.05,195.191,0.859440
0.06,196.541,0.864447
0.07,197.809,0.827441
0.08,199.678,0.775208
...
This structure is intentionally the same as in the CREPE repo, for easy comparison between both methods.
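Since the CSV has a fixed three-column header, it can be post-processed with the Python standard library alone. The snippet below is a minimal sketch, not part of the package: the helper names `load_pitch_csv` and `keep_confident` are hypothetical, the 0.8 confidence threshold is an arbitrary choice, and the units (seconds and Hz) are assumed rather than stated above. It parses the sample rows shown above and keeps only the frames with high confidence.

```python
import csv
import io

# Rows copied from the CSV excerpt above. A real run would read the file
# written by the CLI instead (its exact name is not specified here).
SAMPLE = """time,frequency,confidence
0.00,185.616,0.907112
0.01,186.764,0.844488
0.02,188.356,0.798015
0.03,190.610,0.746729
"""


def load_pitch_csv(handle):
    """Parse a text handle into (time, frequency, confidence) float tuples."""
    reader = csv.DictReader(handle)
    return [
        (float(row["time"]), float(row["frequency"]), float(row["confidence"]))
        for row in reader
    ]


def keep_confident(rows, threshold=0.8):
    """Keep frames whose confidence is at least `threshold` (arbitrary default)."""
    return [row for row in rows if row[2] >= threshold]


rows = load_pitch_csv(io.StringIO(SAMPLE))
confident = keep_confident(rows)

# Units are an assumption: seconds for time, Hz for frequency.
for time, frequency, confidence in confident:
    print(f"{time:.2f} s  {frequency:.3f} Hz  (confidence {confidence:.2f})")
```

To apply the same helpers to an exported file, pass an open text handle (for example one obtained with `open(path, newline="")`) to `load_pitch_csv` in place of the `io.StringIO` wrapper.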