This repository contains speech transcriptions of human-human dialogues collected in 2009 and 2010 through the ANR DECODA project (Ref: 2009 CORD 005 01) at the RATP call center in Paris. All files were anonymized manually. The repository includes the transcriptions produced with the <a href="https://hal.archives-ouvertes.fr/hal-01690349">Transcriber</a> tool as well as the linguistic annotations produced during the project.
The content of this repository is as follows:
- `data`: contains all data linked to the DECODA project
  - `annotations`: linguistic annotations performed on the manual transcriptions of the dialogue speech data
    - `annot-v2014`: annotations obtained with the <a href="https://hal.archives-ouvertes.fr/hal-01194861">MACAON NLP tool suite</a> in 2014
      - `tsv-2009`: annotations on the 2009 partition of the DECODA corpus
  - `text`: readable form of the transcriptions without any markup or annotations
    - `text-2009`: 2009 partition of the DECODA corpus
  - `transcriptions`: manual transcriptions (Transcriber format) of the DECODA speech files
    - `trs-2009`: 2009 partition of the DECODA corpus
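
As a rough illustration of how the Transcriber files could be read, here is a minimal Python sketch. It assumes the files follow the usual Transcriber XML layout, where each `<Turn>` element carries `speaker`, `startTime` and `endTime` attributes and the spoken text sits between `<Sync/>` markers. The function name `iter_turns` and the inline sample are hypothetical and not part of this repository; the actual files may contain additional event or comment tags that need specific handling.

```python
import io
import xml.etree.ElementTree as ET


def iter_turns(trs_source):
    """Yield (speaker, start, end, text) for each <Turn> element.

    `trs_source` is a file path or a binary file object holding a
    Transcriber XML document (assumed layout, see the note above).
    """
    root = ET.parse(trs_source).getroot()
    for turn in root.iter("Turn"):
        # itertext() collects the text found between <Sync/> and other
        # child tags; split/join collapses line breaks and extra spaces.
        text = " ".join("".join(turn.itertext()).split())
        yield (
            turn.get("speaker", ""),
            float(turn.get("startTime", "nan")),
            float(turn.get("endTime", "nan")),
            text,
        )


if __name__ == "__main__":
    # Tiny made-up document, used only to show the call pattern.
    sample = (
        b'<Trans><Episode><Section type="report" startTime="0" endTime="3.5">'
        b'<Turn speaker="spk1" startTime="0" endTime="3.5">'
        b'<Sync time="0"/>\nbonjour\n<Sync time="1.2"/>\nRATP bonjour\n'
        b"</Turn></Section></Episode></Trans>"
    )
    for speaker, start, end, text in iter_turns(io.BytesIO(sample)):
        print(speaker, start, end, text)
```

To process a real file, pass its path to `iter_turns` instead of the in-memory sample, for example a file taken from the `trs-2009` directory.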
## Reference
Please cite this reference for any publication using the DECODA data:
[DECODA: a call-centre human-human spoken conversation corpus](http://www.lrec-conf.org/proceedings/lrec2012/pdf/684_Paper.pdf) (Bechet et al., LREC 2012)
Frederic Bechet, Benjamin Maza, Nicolas Bigouroux, Thierry Bazillon, Marc El-Bèze, Renato De Mori, and Eric Arbillot. 2012. DECODA: a call-centre human-human spoken conversation corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1343–1347, Istanbul, Turkey. European Language Resources Association (ELRA).
```
@inproceedings{bechet-etal-2012-decoda,
    title = "{DECODA}: a call-centre human-human spoken conversation corpus",
    author = "Bechet, Frederic and
      Maza, Benjamin and
      Bigouroux, Nicolas and
      Bazillon, Thierry and
      El-B{\`e}ze, Marc and
      De Mori, Renato and
      Arbillot, Eric",
    booktitle = "Proceedings of the Eighth International Conference on Language Resources and Evaluation ({LREC}'12)",
    month = may,
    year = "2012",
    address = "Istanbul, Turkey",
    publisher = "European Language Resources Association (ELRA)",
    pages = "1343--1347",
}
```

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png"/></a><br/><span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">DECODA Corpus</span> by <span xmlns:cc="http://creativecommons.org/ns#" property="cc:attributionName">ANR DECODA Ref: 2009 CORD 005 01</span> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.<br/>Based on a work at <a xmlns:dct="http://purl.org/dc/terms/" href="https://aclanthology.org/L12-1399" rel="dct:source">https://aclanthology.org/L12-1399</a>.