... | ... | @@ -38,7 +38,7 @@ Similarly to _PARSEME:MWE_, the information in the 11th column called _PARSEME-F |
|
|
- for the (linearly) first component of an MWE/NE, the code consists of an identifier followed by a colon ':' and a **LABEL**:
|
|
|
* **LABELS** provide information about the MWE/NE and are composed of a **POS** field, a **CATEGORY** field and a **CRITERIA** field, separated by a pipe '|' character (i.e. POS|CATEGORY|CRITERION1,CRITERION2...). For instance, ADP|MWE|IRREG describes an MWE (not an NE) whose part of speech is ADP and for which the criterion IRREG has been used:
|
|
|
|
|
|
1. **POS** is a tag representing the part of speech of the whole MWE/NE, using the tagset of the syntactic annotation scheme (either UD or FTBdep), except for some MWEs that were classified as syntactically regular, in which case the POS is irrelevant ("_" is used). Please refer to the [page describing the heuristics used to classify MWEs as regular / irregular, and to assign the POS to irregular ones](reg-irreg-pos-heuristics) for details.
|
|
|
1. **POS** is a tag representing the part of speech of the whole MWE/NE, using the tagset of the syntactic annotation scheme (either UD or FTBdep), except for some MWEs that were classified as syntactically regular, in which case the POS is irrelevant ("_" is used). For details, please refer to the [interaction page](https://gitlab.lis-lab.fr/PARSEME-FR/PARSEME-FR-public/wikis/Interaction-between-syntactic-annotation-and-MWE/Interaction-between-syntactic-annotation-and-MWE-status), which describes the heuristics used to classify MWEs as regular / irregular and to assign the POS to irregular ones.
|
|
|
- Named entities always have the POS for proper nouns.
|
|
|
2. **CATEGORY** corresponds to the type of unit being annotated. The category starts with either "NE" for named entities or "MWE" for multi-word expressions that are not named entities.
|
|
|
* MWE is used for non-verbal MWEs
|
... | ... | |
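The code format described in this hunk (identifier, colon, then POS|CATEGORY|CRITERIA) can be illustrated with a small parser. This is only a sketch under stated assumptions: the names `Label` and `parse_first_component` are hypothetical, the identifier `1` in the usage line is made up for illustration, and the code assumes all three label fields are always present. It handles only the code of the first component of an MWE/NE.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Label:
    """Hypothetical container for the three fields of a LABEL."""
    pos: str             # POS of the whole MWE/NE, or "_" when irrelevant
    category: str        # starts with "NE" or "MWE"
    criteria: List[str]  # e.g. ["IRREG"]

    def is_named_entity(self) -> bool:
        # Categories start with "NE" for named entities, "MWE" otherwise.
        return self.category.startswith("NE")


def parse_first_component(code: str) -> Tuple[str, Label]:
    """Split 'ID:POS|CATEGORY|CRIT1,CRIT2' into its identifier and Label."""
    identifier, sep, label = code.partition(":")
    if not sep:
        raise ValueError(f"no ':' separator in code: {code!r}")
    fields = label.split("|")
    if len(fields) != 3:
        # Assumption: POS, CATEGORY and CRITERIA are all present.
        raise ValueError(f"expected 3 pipe-separated fields in: {label!r}")
    pos, category, criteria = fields
    return identifier, Label(pos, category, [c for c in criteria.split(",") if c])


# Usage with the example label from the text; the identifier "1" is assumed.
identifier, label = parse_first_component("1:ADP|MWE|IRREG")
print(identifier, label.pos, label.category, label.criteria, label.is_named_entity())
```

Splitting on the first colon only (via `partition`) keeps the label intact even if a later field were to contain a colon, which the description above does not rule out.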