diff --git a/README.md b/README.md
index bd07a34dfbe31c44e9ef9f5c12db79e9a880b04e..6e4b552e8b60816147a9672285219b32cde35bcc 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,44 @@
 ## Content
 
 This repository contains speech transcriptions of human-human dialogues collected in 2009 and 2010 through the ANR DECODA project (Ref: 2009 CORD 005 01) at the RATP call center in Paris. All files are anonymized manually. The repository contains the transcriptions made with the <a href="https://hal.archives-ouvertes.fr/hal-01690349">Transcriber</a> tool as well as linguistic annotations made during the project.
+The content of this repository is as follows:
+
+- `data` : contains all data linked to the DECODA project
+  - `annotations` : linguistic annotations performed on the manual transcriptions of the dialogue speech data
+    - `annot-v2014` : annotations obtained with the <a href="https://hal.archives-ouvertes.fr/hal-01194861">MACAON NLP tool suite</a> in 2014
+      - `tsv-2009` : annotations on the 2009 partition of the DECODA corpus
+  - `text` : readable form of the transcriptions without any markup or annotations
+    - `text-2009` : 2009 partition of the DECODA corpus
+  - `transcriptions` : manual transcriptions (Transcriber format) of the DECODA speech files
+    - `trs-2009` : 2009 partition of the DECODA corpus
+
+## Reference
+
+Please cite this reference for any publication using the DECODA data:
+
+[DECODA: a call-centre human-human spoken conversation corpus](http://www.lrec-conf.org/proceedings/lrec2012/pdf/684_Paper.pdf) (Bechet et al., LREC 2012)
+
+Frederic Bechet, Benjamin Maza, Nicolas Bigouroux, Thierry Bazillon, Marc El-Bèze, Renato De Mori, and Eric Arbillot. 2012. DECODA: a call-centre human-human spoken conversation corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1343–1347, Istanbul, Turkey. European Language Resources Association (ELRA).
+
+```
+@inproceedings{bechet-etal-2012-decoda,
+    title = "{DECODA}: a call-centre human-human spoken conversation corpus",
+    author = "Bechet, Frederic and
+      Maza, Benjamin and
+      Bigouroux, Nicolas and
+      Bazillon, Thierry and
+      El-B{\`e}ze, Marc and
+      De Mori, Renato and
+      Arbillot, Eric",
+    booktitle = "Proceedings of the Eighth International Conference on Language Resources and Evaluation ({LREC}'12)",
+    month = may,
+    year = "2012",
+    address = "Istanbul, Turkey",
+    publisher = "European Language Resources Association (ELRA)",
+    url = "http://www.lrec-conf.org/proceedings/lrec2012/pdf/684_Paper.pdf",
+    pages = "1343--1347"
+}
+```
 
 ## License
 <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br /><span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">DECODA Corpus</span> by <span xmlns:cc="http://creativecommons.org/ns#" property="cc:attributionName">ANR DECODA Ref: 2009 CORD 005 01</span> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.<br />Based on a work at <a xmlns:dct="http://purl.org/dc/terms/" href="https://aclanthology.org/L12-1399" rel="dct:source">https://aclanthology.org/L12-1399</a>.