alexis.nasr created page: mcf, authored by Alexis Nasr
...@@ -4,15 +4,17 @@ The Multi Column Files (mcf) format is the text format used to represent text an
The list of labels is the following:
* **FORM**: form of the word
* **CPOS**: coarse part of speech
* **POS**: part of speech
* **LEMMA**: lemma
* **FEATS**: other linguistic features (usually morphological)
* **GOV**: relative position of the governor (-n means the governor is n words to the left; n means it is n words to the right)
* **LABEL**: label of the syntactic dependency
* **SENT_SEG**: indicates that the word is the last word in the sentence
* **A** to **Z**: other labels used to represent other useful information (word duration, speaker, ...)
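
The relative encoding of **GOV** and the role of **SENT_SEG** can be illustrated with a minimal Python sketch. The function names `governor_index` and `split_sentences` are hypothetical, and the sketch assumes that **SENT_SEG** takes the value 1 on the last word of a sentence and 0 elsewhere, which the list above does not state explicitly. How the root of a sentence is encoded is not covered here.

```python
from typing import List


def governor_index(position: int, gov: int) -> int:
    """Return the absolute index of the governor of the word at `position`.

    `gov` is the relative offset stored in the GOV column:
    negative means the governor is to the left, positive to the right.
    """
    return position + gov


def split_sentences(sent_seg: List[int]) -> List[List[int]]:
    """Group word indices into sentences using SENT_SEG flags.

    Assumption: a flag of 1 marks the last word of a sentence.
    """
    sentences: List[List[int]] = []
    current: List[int] = []
    for index, flag in enumerate(sent_seg):
        current.append(index)
        if flag == 1:
            sentences.append(current)
            current = []
    if current:
        # Trailing words with no closing flag are kept as a final group.
        sentences.append(current)
    return sentences
```

For example, a word at index 0 with a **GOV** value of 1 has its governor at index 1, and a word at index 3 with a **GOV** value of -2 also has its governor at index 1.
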
Here is an example of two sentences represented as an mcf. The first column corresponds to **FORM**, the second to **POS**, the third to **LEMMA**, the fourth to **GOV**, the fifth to **LABEL**, and the last to **SENT_SEG**.
la | det | le | 1 | det | 0
---- | ---- | ---- | ---- | ---- | ----
...
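
As a rough illustration of the column layout used in the example, the following Python sketch maps one line to its labels. The function `parse_mcf_line` and the constant `EXAMPLE_COLUMNS` are hypothetical names, and the tab separator is an assumption: this page does not say how columns are delimited in an actual file, so the separator is left as a parameter.

```python
from typing import Dict, Sequence

# Column order of the example on this page (other files may use another order).
EXAMPLE_COLUMNS = ("FORM", "POS", "LEMMA", "GOV", "LABEL", "SENT_SEG")


def parse_mcf_line(
    line: str,
    columns: Sequence[str] = EXAMPLE_COLUMNS,
    sep: str = "\t",  # assumption: the real separator is not stated here
) -> Dict[str, str]:
    """Map the fields of one line to their column labels.

    All values are returned as strings; converting GOV or SENT_SEG
    to integers is left to the caller.
    """
    fields = [field.strip() for field in line.rstrip("\n").split(sep)]
    if len(fields) != len(columns):
        raise ValueError(
            f"expected {len(columns)} columns, found {len(fields)}"
        )
    return dict(zip(columns, fields))
```

Calling `parse_mcf_line("la | det | le | 1 | det | 0", sep="|")` on the row as rendered above yields a dictionary in which `FORM` is `"la"` and `GOV` is the string `"1"`.
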