DECODA-PUBLIC

DECODA corpus

Content

This repository contains speech transcriptions of human-human dialogues collected in 2009 and 2010 through the ANR DECODA project (Ref: 2009 CORD 005 01) at the RATP call center in Paris. All files are anonymized manually. The repository contains the transcriptions made with the Transcriber tool as well as linguistic annotations made during the project.

The content of this repository is as follows:

  • data : contains all data linked to the DECODA project
    • annotations : linguistic annotations performed on the manual transcriptions of the dialogue speech data
      • annot-v2014 : annotations obtained with the MACAON NLP tool suite in 2014
        • tsv-2009 : annotations on the 2009 partition of the DECODA corpus
    • text : readable form of the transcriptions without any markup or annotations
      • text-2009 : 2009 partition of the DECODA corpus
    • transcriptions : manual transcriptions (Transcriber format) of the DECODA speech files
      • trs-2009 : 2009 partition of the DECODA corpus
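
As a convenience, the layout above can be navigated with a short script. This is a minimal sketch, not part of the repository: the `DECODA_2009` mapping and the `list_partition` helper are hypothetical names introduced here, and the script assumes it is run from a local clone whose directories match the list above. It makes no assumption about file names or extensions inside each directory.

```python
from pathlib import Path

# Relative locations of the 2009 partition, copied from the list above.
# The dictionary name and keys are illustrative, not defined by the repository.
DECODA_2009 = {
    "annotations": Path("data/annotations/annot-v2014/tsv-2009"),
    "text": Path("data/text/text-2009"),
    "transcriptions": Path("data/transcriptions/trs-2009"),
}


def list_partition(repo_root, kind):
    """Return the sorted regular files under one of the 2009 directories.

    Returns an empty list when the directory does not exist, for example
    when the script is run outside a clone of the repository.
    """
    directory = Path(repo_root) / DECODA_2009[kind]
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.iterdir() if p.is_file())


if __name__ == "__main__":
    # Print how many files each directory holds in the current clone.
    for kind in DECODA_2009:
        files = list_partition(".", kind)
        print(f"{kind}: {len(files)} files")
```

Keeping the three paths in one mapping means that a change to the directory layout only needs to be reflected in one place.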

Reference

Please cite this reference for any publication using the DECODA data:

DECODA: a call-centre human-human spoken conversation corpus (Bechet et al., LREC 2012)

Frederic Bechet, Benjamin Maza, Nicolas Bigouroux, Thierry Bazillon, Marc El-Bèze, Renato De Mori, and Eric Arbillot. 2012. DECODA: a call-centre human-human spoken conversation corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1343–1347, Istanbul, Turkey. European Language Resources Association (ELRA)

@inproceedings{bechet-etal-2012-decoda,
    title = "{DECODA}: a call-centre human-human spoken conversation corpus",
    author = "Bechet, Frederic and Maza, Benjamin and Bigouroux, Nicolas and Bazillon, Thierry and El-B{\`e}ze, Marc and De Mori, Renato and Arbillot, Eric",
    booktitle = "Proceedings of the Eighth International Conference on Language Resources and Evaluation ({LREC}'12)",
    month = may,
    year = "2012",
    address = "Istanbul, Turkey",
    publisher = "European Language Resources Association (ELRA)",
    url = "http://www.lrec-conf.org/proceedings/lrec2012/pdf/684_Paper.pdf",
    pages = "1343--1347"
}

License

Creative Commons License
DECODA Corpus by ANR DECODA Ref: 2009 CORD 005 01 is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://aclanthology.org/L12-1399.