The Multi Column Files (mcf) format is a plain-text format used to represent text together with its annotations. Every line of an mcf file corresponds to one atomic unit of text (loosely called a word). Each column describes one attribute of that token, and columns are separated by tab characters. The number of columns in an mcf file is unbounded. Columns can be associated with labels via an mcd file.
The list of labels is the following:
- FORM
- POS
- LEMMA
- GOV
- LABEL
- SENT_SEG
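As a minimal sketch of how a single mcf line could be read, the snippet below splits a line on tab characters and pairs each column with a label. The column order (the same order as the list above) and the helper name `parse_mcf_line` are assumptions made for illustration; in practice the mapping comes from an mcd file, whose syntax is not described here.

```python
# Labels taken from the list above. Using them in this column order is an
# assumption; the real column-to-label mapping is defined by an mcd file.
LABELS = ["FORM", "POS", "LEMMA", "GOV", "LABEL", "SENT_SEG"]


def parse_mcf_line(line, labels=LABELS):
    """Split one mcf line on tabs and pair each column with a label.

    Hypothetical helper, not part of any mcf tooling.
    """
    columns = line.rstrip("\n").split("\t")
    # zip() stops at the shorter sequence, so columns beyond the known
    # labels are simply left out of the result in this sketch.
    return dict(zip(labels, columns))


token = parse_mcf_line("diane\tnc\tdiane\t1\tsuj\t0\n")
print(token["FORM"], token["POS"], token["GOV"])
```

All values stay as strings here; converting numeric columns such as GOV to integers is left to the caller.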
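| FORM | POS | LEMMA | GOV | LABEL | SENT_SEG |
| --- | --- | --- | --- | --- | --- |
| la | det | le | 1 | det | 0 |
| diane | nc | diane | 1 | suj | 0 |
| chantait | v | chanter | 0 | root | 0 |
| dans | prep | dans | -1 | mod | 0 |
| la | det | le | 1 | det | 0 |
| cour | nc | cour | -2 | obj | 0 |
| des | prep | des | -1 | dep | 0 |
| casernes | nc | caserne | -1 | obj | 0 |
| . | poncts | . | -6 | eos | 1 |
| et | coo | et | 0 | root | 0 |
| le | det | le | 1 | det | 0 |
| vent | nc | vent | 3 | suj | 0 |
| du | prep | du | -1 | dep | 0 |
| matin | nc | matin | -1 | obj | 0 |
| soufflait | v | souffler | -5 | dep_coord | 0 |
| sur | prep | sur | -1 | mod | 0 |
| les | det | le | 1 | det | 0 |
| lanternes | nc | lanterne | -2 | obj | 0 |
| . | poncts | . | -9 | eos | 1 |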
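The example rows can be processed with a short script. The sketch below rests on two readings inferred from the example data rather than stated by the format description: a SENT_SEG value of `1` closes a sentence, and GOV is an offset relative to the current token, with `0` marking the root. The column positions and the helper names `split_sentences` and `governor_form` are likewise assumptions.

```python
# Example rows from this document, written with spaces for readability and
# converted to tab-separated mcf lines just below.
EXAMPLE = """\
la det le 1 det 0
diane nc diane 1 suj 0
chantait v chanter 0 root 0
dans prep dans -1 mod 0
la det le 1 det 0
cour nc cour -2 obj 0
des prep des -1 dep 0
casernes nc caserne -1 obj 0
. poncts . -6 eos 1
et coo et 0 root 0
le det le 1 det 0
vent nc vent 3 suj 0
du prep du -1 dep 0
matin nc matin -1 obj 0
soufflait v souffler -5 dep_coord 0
sur prep sur -1 mod 0
les det le 1 det 0
lanternes nc lanterne -2 obj 0
. poncts . -9 eos 1
"""
mcf_lines = ["\t".join(line.split()) for line in EXAMPLE.splitlines()]

# Assumed column positions: FORM, POS, LEMMA, GOV, LABEL, SENT_SEG.
FORM, GOV, SENT_SEG = 0, 3, 5


def split_sentences(rows):
    """Group rows into sentences; SENT_SEG == "1" is assumed to end one."""
    sentences, current = [], []
    for row in rows:
        current.append(row)
        if row[SENT_SEG] == "1":
            sentences.append(current)
            current = []
    if current:  # keep trailing tokens that lack a closing marker
        sentences.append(current)
    return sentences


def governor_form(sentence, index):
    """Return the FORM of a token's governor, or None for the root.

    Assumes GOV is an offset relative to the token's own position.
    """
    offset = int(sentence[index][GOV])
    if offset == 0:
        return None
    return sentence[index + offset][FORM]


rows = [line.split("\t") for line in mcf_lines]
sentences = split_sentences(rows)
for sentence in sentences:
    print(" ".join(row[FORM] for row in sentence))
```

Under these assumptions, `governor_form(sentences[0], 1)` resolves the governor of `diane` to `chantait`, the next token, since its GOV value is `1`.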