# Multi Column Format

The Multi Column Format (mcf) is the text format of the files used to represent text and its annotations. Every line of an mcf corresponds to an atomic unit of text (loosely called a word). Each column describes an attribute of that atomic token. Columns are separated by tab characters. The number of columns in an mcf is unbounded. Columns can be associated with a **label** via an [mcd](mcd) file. Associating a column with a [label](column_labels) makes it possible to access the content of that column through [Word Features](features).

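The description above can be illustrated with a minimal Python sketch that splits mcf text into one record per line, keyed by column label. This is only an illustration, not part of any mcf tooling: the function name `parse_mcf` and the sample tokens are invented here, and the labels are passed in as a plain list because the syntax of the mcd file is not described in this section.

```python
from typing import Dict, List


def parse_mcf(text: str, labels: List[str]) -> List[Dict[str, str]]:
    """Split mcf text into one dict per line, keyed by column label.

    `labels` stands in for the column-to-label mapping that an mcd file
    would normally provide (hypothetical simplification).
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: empty lines carry no token and are skipped
        columns = line.split("\t")  # columns are separated by tab characters
        # zip stops at the shorter sequence, so columns without a label are ignored
        rows.append(dict(zip(labels, columns)))
    return rows


# Invented sample: three tokens, three tab-separated columns each.
sample = "The\tdet\tthe\ncats\tnoun\tcat\nsleep\tverb\tsleep\n"
rows = parse_mcf(sample, ["FORM", "POS", "LEMMA"])
print(rows[1]["LEMMA"])  # cat
```

Because the number of columns is unbounded, the sketch keeps only the columns that have a label and silently drops the rest; a stricter reader could raise an error instead.
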
Here is an example of two sentences represented as an mcf. The first column corresponds to **FORM**, the second to **POS**, the third to **LEMMA**, the fourth to **GOV**, the fifth to **LABEL**, and the last to **SENT_SEG**:
... | ... | |