RAVEN2YOLO

    This repository was created to simplify learning YOLOv5 and to adapt it for bioacoustics. It supports a paper analyzing Humpback Whale (Megaptera novaeangliae) vocalizations through the automatic detection and classification of 28 units.

    See: Publication


    This repository includes essential scripts for adapting YOLOv5 to bioacoustics research. Key scripts provided are:

    • get_spectrogram.py: extracts spectrograms from .wav files
    • labelme2yolo.py: converts LabelMe annotations to the YOLO format
    • yolo2labelme.py: converts YOLO annotations back to LabelMe-style JSON files
    • get_train_val.py: splits data into training, validation, and optional test sets
    • get_train_annot.py: builds spectrograms and YOLO annotations from Raven-style annotation files
    • get_time_freq_detection.py: compiles detections into a NetCDF (.nc) file
    To use the scripts with the Raven annotation software (Raven Lite), export annotations in the recommended format (an example annotation table is shown after these steps) and follow these steps:

    • Go into the folder that contains the scripts
    • Run get_train_annot.py
    • Launch training by following the instructions below
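
    For reference, a Raven selection table is a tab-separated text file (columns are aligned here for readability); a minimal example with the additional Path column expected by get_train_annot.py might look like the following (the annotation column and all values are illustrative, keep whatever columns Raven exports and simply append Path):

    Selection  Begin Time (s)  End Time (s)  Low Freq (Hz)  High Freq (Hz)  Annotation  Path
    1          12.40           14.85         250            1200           unit_03     /data/recordings/file_001.wav
    2          27.10           29.05         180            900            unit_07     /data/recordings/file_001.wav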



    To use the scripts without Raven annotation software, you can follow these steps:

    • Run get_spectrogram.py
    • Install LabelMe (pip install labelme) and annotate the spectrograms
    • Run labelme2yolo.py (the resulting label format is shown after this list)
    • Run get_train_val.py
    • Launch training
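
    For reference, YOLO labels are plain-text .txt files with one object per line: a class index followed by the bounding-box centre coordinates, width, and height, all normalized to the image dimensions. A minimal example with two boxes (class indices and values are illustrative):

    0 0.482 0.367 0.110 0.205
    3 0.731 0.540 0.085 0.160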



    The get_time_freq_detection.py script compiles detections into a NetCDF (.nc) file, detailing minimum, mid-range, and maximum frequency, duration, detection position, and model confidence.

    Additional scripts may be added over time to automate other processes.


    Open in Colab




    • For proper citation when using this methodology, please refer to the provided CITATION.cff file.



    Install

    git clone https://gitlab.lis-lab.fr/stephane.chavin/raven2yolo.git

    cd raven2yolo

    pip install -r requirements.txt

    Spectrogram Extraction Script

    Description

    This script extracts spectrograms from .wav files. It allows you to specify various parameters such as duration, window size, hop ratio, high and low pass filters, overlap, resampling frequency, and CPU usage to optimize the process.

    Usage

    To run the script, use the following command:

    python get_spectrogram.py <path> <directory> [options]

    Arguments

    Positional Arguments

    • path: Path to the folder or file that contains the recordings.
    • directory: Directory where the extracted spectrograms will be stored.

    Optional Arguments

    --duration (int): Duration of each spectrogram, in seconds. Default is 8.
    --window (int): Window size for the Fourier transform. Default is 1024.
    --hop (float): Hop ratio within the window (50% corresponds to 0.5). Default is 0.5.
    --high (int): High-pass filter value in Hz. Default is 10.
    --low (int): Low-pass filter value in Hz. Default is None.
    --overlap (int): Overlap in seconds between two spectrograms. Default is 0.
    --rf (int): Resampling frequency of the signal. If not provided, the original sampling frequency of the recording is used. Default is None.
    --cpu (int): Number of CPUs used to speed up processing. Provide 2 or more. Default is 1.
    --test (flag): If provided, sets the test flag to 1; otherwise it is None.
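
    For example, to extract 8-second spectrograms resampled to 22050 Hz using 4 CPUs (paths and values are illustrative; omitted options keep their defaults):

    python get_spectrogram.py /data/recordings/ /data/spectrograms/ --duration 8 --rf 22050 --cpu 4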

    LabelMe to YOLO Annotation Converter

    Description

    This script converts annotations from the LabelMe format to a YOLO-compatible format. It allows you to specify the path to the LabelMe annotations and the directory where the converted YOLO annotations will be stored.

    Usage

    To run the script, use the following command:

    python labelme2yolo.py <path_to_data> <directory>

    Arguments

    Positional Arguments

    • path_to_data: Path to the folder that contains the LabelMe annotations.
    • directory: Directory where the YOLO annotations will be stored.
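
    For example (paths are illustrative):

    python labelme2yolo.py /data/spectrograms/ /data/labels/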

    YOLO to JSON/LabelMe Annotation Converter

    Description

    This script converts annotations from the YOLO format (stored in .txt files) to JSON files. It allows you to specify the path to the folder containing the .txt files, the path to the folder containing the images, and optionally the directory where the modified JSON files will be stored.

    Usage

    To run the script, use the following command:

    python yolo2labelme.py <path_to_txt> <path_to_img> [options]

    Arguments

    Positional Arguments

    • path_to_txt: Path to the folder containing the .txt files.
    • path_to_img: Path to the folder containing the .jpg images.

    Optional Arguments

    -d, --directory (str): Directory where the modified JSON files will be stored. If not provided, the directory will be the same as path_to_txt.
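
    For example (paths are illustrative):

    python yolo2labelme.py /data/detections/labels/ /data/spectrograms/ -d /data/json/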


    Data Splitting and Storage Script

    Description

    This script splits data into training, validation, and optionally test sets based on a specified ratio. It allows you to specify the path to the folder containing the .txt files, the directory where the spectrogram and .txt files will be stored, and an optional test flag to include a test split.

    Usage

    To run the script, use the following command:

    python get_train_val.py <path_to_data> <directory> [options]

    Arguments

    Positional Arguments

    • path_to_data: Path to the folder that contains the .txt files (should end with labels/).
    • directory: Directory where the spectrogram and .txt files will be stored (should be different from <path_to_data>).
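
    For example (paths are illustrative):

    python get_train_val.py /data/dataset/labels/ /data/dataset_split/

    YOLOv5 conventionally expects a dataset laid out with images/ and labels/ folders split into train/ and val/ subfolders (plus test/ when a test split is used). The sketch below shows that conventional layout, which the output directory is expected to follow; check the folders the script actually creates:

    dataset_split/
    ├── images/
    │   ├── train/
    │   └── val/
    └── labels/
        ├── train/
        └── val/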

    CSV to Spectrogram and Annotation Converter

    Description

    This script creates .txt and .jpg files for each annotation from a CSV file. It takes in the path to the CSV file containing annotations, the path to the folder containing the recordings, and the directory where the spectrograms and .txt files will be stored. The script includes options for setting the duration and overlap of the spectrograms, frequency resampling, window size, hop ratio, CPU usage, and an optional test flag to include a test split.

    Usage

    To run the script, use the following command:

    python get_train_annot.py <filename_path> <path_to_data> <directory> [options]

    Arguments

    Positional Arguments

    • filename_path: Path/name of the folder/file containing the annotations. If a file, use Raven format and add a Path column with the path to the .wav files.
    • path_to_data: Path of the folder that contains the recordings.
    • directory: Directory where the spectrograms and .txt files will be stored.

    Optional Arguments

    --duration (int): Duration for each spectrogram. Default is 8.
    --overlap (int): Overlap in seconds between two spectrograms. Default is 2.
    --rf (int): Frequency resampling. Default is None.
    --window (int): Window size for the Fourier Transform. Default is 1024.
    --hop (float): Ratio of hop in window (e.g., 50% = 0.5). Default is 0.5.
    --cpu (int): Number of CPUs to speed up the process. Default is 1.
    --test (flag): If provided, splits the data into train/test/validation sets, with a share of (1 - ratio)/2 going to the test set and the same share to validation. If not provided, only train and validation splits are created.
    --minimum: If True, vmin is set to stft.mean(); otherwise stft.min().
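
    For example, using a Raven-style annotation file and 4 CPUs (paths and values are illustrative; omitted options keep their defaults):

    python get_train_annot.py /data/annotations.txt /data/recordings/ /data/dataset/ --duration 8 --overlap 2 --rf 22050 --cpu 4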

    Detection Collector and DataFrame Generator

    Description

    This script collects detections from .txt files and returns a complete dataframe. It takes in the path to the folder containing the .txt files, the directory where the dataframe will be stored, and the path to the YOLOv5 custom_data.yaml file. Additionally, it allows specifying the sampling rate and the duration of the spectrogram.

    Usage

    To run the script, use the following command:

    python get_time_freq_detection.py <path_to_data> <directory> <names> [options]

    Arguments

    Positional Arguments

    • path_to_data: Path to the folder that contains the .txt files.
    • directory: Directory where the dataframe will be stored.
    • names: Path to the YOLOv5 custom_data.yaml file.

    Optional Arguments

    -s, --sr (int): Sampling rate of the spectrogram. This argument is required.
    --duration (int): Duration of the spectrogram. Default is 8.
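
    For example (paths and sampling rate are illustrative):

    python get_time_freq_detection.py /data/detections/labels/ /data/output/ custom_data.yaml -s 22050 --duration 8

    The resulting NetCDF (.nc) file can then be inspected with xarray; a minimal sketch, assuming xarray is installed (the output file name and the variable names in the comments are assumptions, list the real ones with ds.data_vars):

    import xarray as xr

    # Open the NetCDF file produced by get_time_freq_detection.py
    ds = xr.open_dataset("/data/output/detections.nc")  # file name is illustrative

    print(ds)            # dimensions, coordinates and variables
    print(ds.data_vars)  # e.g. min/mid/max frequency, duration, position, confidence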

    Training a YOLOv5 model


    For this project, we adapt the YOLOv5 DataLoader to run detection on a folder that contains .wav files. If you need more information about YOLOv5, see: https://github.com/ultralytics/yolov5
    • Jocher Glenn (2020), "YOLOv5 by Ultralytics", doi: 10.5281/zenodo.3908559, license: AGPL-3.0

    python yolov5/train.py --imgsz <IMG_SIZE> --batch <BATCH_SIZE> --epochs <NB_EPOCHS> --data <custom_data.yaml> --weights yolov5/weights/yolov5l.pt --hyp <custom_hyp.yaml> --cache
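
    The <custom_data.yaml> file follows the standard YOLOv5 dataset configuration format; a minimal sketch (paths and class names are illustrative, and the paper's setup uses 28 units rather than the three shown here):

    # custom_data.yaml
    path: /data/dataset_split        # dataset root
    train: images/train              # training images, relative to path
    val: images/val                  # validation images
    nc: 3                            # number of classes
    names: ['unit_01', 'unit_02', 'unit_03']

    The same file is passed as the <names> argument to get_time_freq_detection.py.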

    Detection


    • Detect on audio files
    python detect.py --weights yolov5/runs/train/<EXP_NB>/weights/best.pt --imgsz <imgsz> --conf <conf> --source <PATH_TO_FOLDER_THAT_CONTAIN_WAV> --save-txt --sound --rf <RF> --sampleDur <SampleDur> --minimum <True/False> --window <window> --hop <hop> --low <low> --high <high> --cmap <cmap> --save-conf

    • Detect on audio files without saving the detection images
    python detect.py --weights yolov5/runs/train/<EXP_NB>/weights/best.pt --imgsz <imgsz> --conf <conf> --source <PATH_TO_FOLDER_THAT_CONTAIN_WAV> --save-txt --sound --rf <RF> --sampleDur <SampleDur> --minimum <True/False> --window <window> --hop <hop> --low <low> --high <high> --cmap <cmap> --save-conf --nosave

    Arguments:

    --imgsz: Inference size height and width. Default is 640.
    --sampleDur: Duration for each spectrogram for detection. Default is 8.
    --sr: Samplerate for each spectrogram for detection. Default is 22050.
    --window: Window size for each spectrogram for detection. Default is 1024.
    --hop: Hop length for each spectrogram for detection. Default is 50% of window = 512.
    --sound: Enable the audio DataLoader (detection directly on .wav files). Default is False. Action 'store_true'.
    --low: Low-pass filter value.
    --high: High-pass filter value.
    --cmap: Colormap for the spectrograms.
    --minimum: If True, vmin is set to stft.mean(); otherwise stft.min().
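
    For example, to run detection on a folder of recordings with illustrative values filled in (omitted options keep the defaults listed above):

    python detect.py --weights yolov5/runs/train/exp/weights/best.pt --imgsz 640 --conf 0.25 --source /data/recordings/ --save-txt --sound --rf 22050 --sampleDur 8 --save-conf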

    Contact

    If you have any questions, please contact me at the following e-mail address: stephane.chavin@univ-tln.fr


    Contributors



    H. Glotin managed the data storage and the human resources for the realisation of this software, within the framework of the AI Chair ADSIL ANR-20-CHIA-0014 (Agence Nationale de la Recherche and DGA AID), SYLVANIA ANR-21-CE04-0019, and the BIODIVERSA EUROPAM 2023-2026 projects.