@@ -59,11 +59,11 @@ Similarly to _PARSEME:MWE_, the information in the 11th column called _PARSEME-F
Here is an example of a sentence using the PARSEME-FR _cupt_ format described above,
showing only columns 1 (ID), 2 (FORM) and 11 (MWE / NE annotation).
showing only columns 1 (ID), 2 (FORM) and 11 (MWE / NE annotation). The sentence contains 8 MWEs/NEs, which we comment below:
E.g. "Peugeot" is annotated as a final ORG named entity (NE-ORG.final), with identifier 2, and also as a primary PERS named entity with identifier 1.
- ids 1 and 2: token 2 "Peugeot" is annotated as a final ORG named entity (NE-ORG.final) with identifier 2, and also as a primary PERS named entity with identifier 1.
"tout au plus" is annotated as a MWE, more precisely tokens "tout", "à", "le" and "plus" are annotated with identifier 3 ("au" is a multi-word token which is not annotated). It has "ADV" as part-of-speech, meaning it behaves as an adverb, but it is considered as irregular from the syntactic point of view. The criterion that was used to annotate it is "IRREG".
- id 3: "tout au plus" is annotated as an MWE; more precisely, the tokens "tout", "à", "le" and "plus" are annotated with identifier 3 (the multi-word token "au" itself is not annotated). Its part of speech is "ADV", meaning it behaves as an adverb, but it is considered syntactically irregular. The criterion used to annotate it is "IRREG", hence the annotation 3:ADV|MWE|IRREG on the first token of the MWE ("tout").
The sentence contains an example of a word (the support verb "effectuait") belonging to two LVCs: the LVC with id 6 contains tokens 21 and 23, and the LVC with id 7 contains the tokens 21 and 26.
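
As an illustration of how such annotations could be read programmatically, here is a minimal Python sketch that splits the content of column 11 into (identifier, features) pairs. It rests on assumptions not stated in this section: that several annotations on the same token are separated by `;` (as in the standard _cupt_ format), that unannotated tokens carry `*` or `_`, and that non-initial tokens of an MWE/NE carry only the identifier. The function name `parse_annotation` is hypothetical.

```python
def parse_annotation(cell):
    """Split a column-11 cell into a list of (identifier, features) pairs.

    Assumptions (not confirmed by the format description above):
    - "*" or "_" marks a token without annotation;
    - ";" separates several annotations on the same token;
    - only the first token of an MWE/NE carries "id:FEATURE|FEATURE|...",
      the other tokens carry the bare identifier.
    """
    if cell in ("*", "_", ""):
        return []
    annotations = []
    for part in cell.split(";"):
        # "3:ADV|MWE|IRREG" -> identifier "3", rest "ADV|MWE|IRREG"
        identifier, _, rest = part.partition(":")
        features = rest.split("|") if rest else []
        annotations.append((int(identifier), features))
    return annotations


# First token of "tout au plus", as described above
print(parse_annotation("3:ADV|MWE|IRREG"))
# A token shared by two expressions, such as the support verb "effectuait"
print(parse_annotation("6;7"))
```

Under these assumptions, a token that belongs to two LVCs simply yields two pairs, one per identifier.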