GSRP TDOA Hyper resolution
gsrp_tdoa_hyperres.py is a Python script that computes the time difference of arrival (TDOA)
using the geometric steered response power (GSRP) method.
The script also produces TDOA estimates with a resolution finer than one sample of the input signal by interpolating
the GSRP loss function with a second-order polynomial.
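The interpolation idea can be illustrated in one dimension. The sketch below is not taken from gsrp_tdoa_hyperres.py (which lists scikit-learn as a requirement for its hyper resolution mode); it only assumes a list of scores sampled at integer lags and fits a parabola through the best sample and its two neighbours. The function name `parabolic_peak` is hypothetical.

```python
def parabolic_peak(y_prev, y_peak, y_next):
    """Vertex of the parabola through (-1, y_prev), (0, y_peak), (1, y_next).

    Returns (offset, value): the sub-sample offset of the vertex relative to
    the middle point, and the interpolated score at that vertex.
    """
    denom = y_prev - 2.0 * y_peak + y_next
    if denom == 0.0:  # the three points are collinear: nothing to refine
        return 0.0, y_peak
    offset = 0.5 * (y_prev - y_next) / denom
    value = y_peak - 0.25 * (y_prev - y_next) * offset
    return offset, value


# Toy score curve whose true maximum lies between samples, at lag 3.3.
scores = [10.0 - (lag - 3.3) ** 2 for lag in range(7)]

# Best integer lag, excluding the borders so both neighbours exist.
best = max(range(1, len(scores) - 1), key=scores.__getitem__)
offset, value = parabolic_peak(scores[best - 1], scores[best], scores[best + 1])
print(f"integer lag: {best}, refined lag: {best + offset:.3f}")
```

Because the toy curve is itself a parabola, the refined lag recovers 3.3 up to floating-point error. On real data the fit is only an approximation around the maximum.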
Prerequisites
This program uses the following libraries (tested with the versions listed, but it should work with others):
- cython 0.29
- numpy 1.21
- scipy 1.7.1
- soundfile 0.9
- scikit-learn 1.0 (needed if you use the hyper resolution mode)
- pandas 1.3 (needed only for saving in a non-CSV format)
- tqdm 4.62 (optional if you want a progress bar)
c_corr.pyx will automatically be compiled on the first run of gsrp_tdoa_hyperres.py.
If the compilation fails, you can compile it manually with:
python cython_setup.py build_ext --inplace
Usage
usage: gsrp_tdoa_hyperres.py [-h] [-c CHANNELS] [-i INVERSE] [-f FRAME_SIZE]
[-s STRIDE | -p STRIDE] [-m MAX_TDOA]
[-S SECONDS] [-E SECONDS] [-l LOW] [-u UP]
[-d DECIMATE] [-t] [-e] [-n] [-w]
[-M {prepare,on-the-fly,auto,smart}] [-q QUOTA]
[-v]
infile outfile
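As an illustration of the usage summary above, here is a hedged sketch that assembles one possible command line from Python. The file names `recording.wav` and `tdoa.csv` and the channel list are hypothetical; the flags and values are the ones documented in this README.

```python
import shlex
import sys

# Hypothetical input/output names; every flag comes from the usage summary.
cmd = [
    sys.executable, "gsrp_tdoa_hyperres.py",
    "-c", "0,1,2,3",   # cross-correlate the first four channels
    "-f", "0.02",      # frame size in seconds
    "-s", "0.01",      # stride in seconds (mutually exclusive with -p)
    "-m", "0.0011",    # maximum TDOA in seconds
    "-e",              # erase the output file if it already exists
    "recording.wav", "tdoa.csv",
]

# Show the equivalent shell command.
print(shlex.join(cmd))

# To actually launch the script, one could use:
#   import subprocess; subprocess.run(cmd, check=True)
```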
Arguments
Computes TDOA estimates from a multi-channel recording.
positional arguments:
infile The sound file to process.
outfile The text or npy file to write results to. Each row
gives the position (in samples), cross-correlation
product in decibels (normalized and unnormalized), the
independent TDOAs (in samples), and TDOAs derived from
the independent ones.
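The split between independent and derived TDOAs can be pictured as follows. With N channels, only N - 1 delays are free (here taken relative to channel 0), and every other pair follows by subtraction. The reference channel, the sign convention, and the function name `derive_tdoas` are assumptions made for this sketch, not details confirmed by the script.

```python
from itertools import combinations


def derive_tdoas(independent):
    """Derive the remaining pairwise TDOAs from the independent ones.

    independent[k] is assumed to be the TDOA (in samples) between
    channel 0 and channel k + 1.
    """
    # Delay of each channel relative to channel 0 (channel 0 itself is 0).
    delays = [0] + list(independent)
    derived = {}
    for i, j in combinations(range(len(delays)), 2):
        if i == 0:
            continue  # pairs (0, j) are the independent TDOAs themselves
        derived[(i, j)] = delays[j] - delays[i]
    return derived


# Four channels: three independent TDOAs, three derived ones.
print(derive_tdoas([3, -2, 5]))
```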
optional arguments:
-h, --help show this help message and exit
Channels:
-c CHANNELS, --channels CHANNELS
The channels to cross-correlate. Accepts two or more,
but beware of high memory use. To be given as a comma-
separated list of numbers, with 0 referring to the
first channel (default: all channels).
-i INVERSE, --inverse INVERSE
Invert the given channels. To be given as a comma-separated
list of numbers, with 0 referring to the first channel
once channels have been chosen by --channels.
Size settings:
-f FRAME_SIZE, --frame-size FRAME_SIZE
The size of the cross-correlation frames in seconds
(default: 0.02)
-s STRIDE, --stride STRIDE
The step between the beginnings of sequential frames
in seconds (default: 0.01)
-p STRIDE, --pos STRIDE
Path to a CSV file giving the positions in seconds. Not allowed
if --stride is set.
-m MAX_TDOA, --max-tdoa MAX_TDOA
The maximum TDOA in seconds (default: 0.0011).
-S SECONDS, --start SECONDS
If given, only analyze from the given position.
-E SECONDS, --end SECONDS
If given, only analyze up to the given position.
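Since the output positions and TDOAs are expressed in samples, it can help to see what these settings amount to at a given sampling rate. This is a minimal sketch, assuming simple rounding to the nearest sample (the script may round differently); `frame_layout` and its parameter names are hypothetical.

```python
def frame_layout(sample_rate, n_samples, frame_size=0.02, stride=0.01,
                 max_tdoa=0.0011, start=0.0, end=None):
    """Convert the size settings from seconds to samples.

    Returns the frame length, the step between frames, the maximum lag,
    and the range of frame start positions, all in samples.
    """
    frame = round(frame_size * sample_rate)
    step = round(stride * sample_rate)
    max_lag = round(max_tdoa * sample_rate)
    first = round(start * sample_rate)
    last = n_samples if end is None else min(n_samples, round(end * sample_rate))
    # Only keep frames that fit entirely inside the analyzed region.
    starts = range(first, last - frame + 1, step)
    return frame, step, max_lag, starts


# One second of audio at 48 kHz with the default settings.
frame, step, max_lag, starts = frame_layout(48_000, 48_000)
print(frame, step, max_lag, len(starts))
```

At 48 kHz the defaults give 960-sample frames every 480 samples, and candidate lags from -53 to +53 samples for each channel pair.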
Filtering:
-l LOW, --low LOW Bottom cutoff frequency. Disabled by default.
-u UP, --up UP Top cutoff frequency. Disabled by default.
-d DECIMATE, --decimate DECIMATE
Downsample the signal by the given factor. Disabled by
default
-t, --temporal If given, any decimation will be applied in the time
domain instead of the spectral domain.
Other:
-e, --erase Erase existing outfile. If outfile exists and --erase
is not provided, the script will exit.
-n, --no-hyperres Disable the hyper resolution evaluation of the TDOA
-w, --wide Use only one level to concatenate the normal and
hyperres results. Behaviour depends on the output file
type.
-M {prepare,on-the-fly,auto,smart}, --mode {prepare,on-the-fly,auto,smart}
How to explore the TDOA space (default: smart).
prepare precomputes all the possible TDOA
pairs and then evaluates them. All the results are saved
in memory.
on-the-fly computes the TDOA pairs at the same
time as it computes the loss function. Only the maximum
is saved. Can be slower than prepare.
smart gradually increases the search space
dimension, reducing the number of TDOAs to evaluate.
auto automatically tries to pick the right
method.
-q QUOTA, --quota QUOTA
Memory limit in bytes for the
smart method. If hit, halt
the computation of the current frame and skip to the
next one. Note that it does not account for other
memory usage, such as the sound data. Can be given with a unit
such as GB, GiO, Ko, ...
-v, --verbose Activate verbose output for smart mode
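The --quota option accepts values with unit suffixes, including French "octet" spellings. The sketch below shows one way such strings could be turned into a byte count. It is not the script's own parser: the function name `parse_quota`, the supported prefixes, and the choice of 1000 for plain prefixes versus 1024 for "i" prefixes are assumptions.

```python
import re

# Power of the base associated with each prefix letter.
_PREFIX_POWER = {"": 0, "k": 1, "m": 2, "g": 3, "t": 4}

# number, optional prefix (k/m/g/t) with optional "i", optional unit (B or o)
_QUOTA_RE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*(?:([kmgt])(i)?)?[bo]?\s*",
                       re.IGNORECASE)


def parse_quota(text):
    """Parse strings such as '2GB', '1.5 GiO' or '512Ko' into bytes."""
    match = _QUOTA_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"cannot parse memory quota: {text!r}")
    number, prefix, binary = match.groups()
    base = 1024 if binary else 1000  # assumed convention, see lead-in
    power = _PREFIX_POWER[(prefix or "").lower()]
    return int(float(number) * base ** power)


for example in ("2GB", "1GiO", "512Ko", "1000"):
    print(example, "->", parse_quota(example))
```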