Commit 3252a942 authored by Elie Antoine

Adding data and README

# README
## JSON Structure
The JSON object contains the following fields:
- `lu`: The lexical unit (LU) that triggers the frame in the sentence.
- `pos_lu`: The part-of-speech tag of the LU (**corresponding to *f<sub>trigger</sub>***).
- `lemma_lu`: The lemma or root form of the LU.
- `frame`: The semantic frame associated with the LU.
- `question`: The text of the question.
- `id`: A unique identifier for the question-answer pair.
- `answers`: A list of dictionaries containing the reference answers with the following fields:
  - `text`: The text of the reference answer.
  - `role`: The semantic role of the answer.
  - `answer_start`: The starting character offset of the answer in the context.
  - `answer_end`: The ending character offset of the answer in the context.
  - `coref`: A dictionary of coreference information, with `anchor` and `mentions` fields.
  - `wrong_answer`: The original, incorrect reference answer if a correction was made.
- `predictions`: A dictionary containing model predictions and corresponding ROUGE-L scores. Each model has an entry with:
  - `answer_pred`: The answer predicted by the model.
  - `rougeL`: The ROUGE-L score of the prediction.
- `human_annot`: A dictionary containing human annotations for each model's output. Each model has an entry which is a list of annotations:
  - `annot`: The annotation identifier.
  - `rating`: The rating given by the human annotator (e.g., "Correct").
- `lu_in_question`: Boolean indicating whether the trigger appears in the question for this example (**corresponding to *f<sub>LU in q</sub>***).
- `nb_fe_frame`: Number of Frame Elements in the frame that triggered the question (**corresponding to *f<sub>nb FEs</sub>***).
- `list_dep_lu_ans`: List of the dependency relations traversed between the answer and the frame trigger.
- `nb_arc_lu_ans`: Number of dependency arcs between the answer and the trigger of the question's frame (**corresponding to *f<sub>dist</sub>***).
- `entropy_frame`: Entropy of the question's frame, shared by all examples of this frame (**corresponding to *f<sub>entropy</sub>***).
- `complexity_vector`: A list in which each element corresponds to a complexity factor: 1 if the factor is "active", meaning the example belongs to the difficult group for that factor, and 0 otherwise. Indexes correspond to the following complexity factors:
  - `0`: ***f<sub>LU in q</sub>***
  - `1`: ***f<sub>trigger</sub>***
  - `2`: ***f<sub>dist</sub>***
  - `3`: ***f<sub>entropy</sub>***
  - `4`: ***f<sub>nb FEs</sub>***
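
The index-to-factor mapping above can be made explicit in code. The following is a minimal Python sketch, not part of the dataset tooling: the names `FACTOR_NAMES` and `active_factors` are hypothetical, and the labels are plain-text renderings of the factor symbols listed above.

```python
# Hypothetical helper, not shipped with the dataset.
# Labels follow the index order documented for `complexity_vector`.
FACTOR_NAMES = ["f_LU_in_q", "f_trigger", "f_dist", "f_entropy", "f_nb_FEs"]


def active_factors(complexity_vector):
    """Return the names of the complexity factors whose flag is 1."""
    if len(complexity_vector) != len(FACTOR_NAMES):
        raise ValueError("expected one entry per complexity factor")
    return [name for name, flag in zip(FACTOR_NAMES, complexity_vector) if flag == 1]


# A vector with indexes 1 and 4 active.
print(active_factors([0, 1, 0, 0, 1]))  # ['f_trigger', 'f_nb_FEs']
```

Keeping the labels in a single list means the index order is stated once and can be checked against the length of each vector.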
## Example
Here is an example of the JSON structure:
```json
{
    "lu": "devient",
    "pos_lu": "AUXE",
    "lemma_lu": "devenir",
    "frame": "Becoming",
    "question": "Quel type d'État devient l'Irlande ?",
    "id": "8abee7c1-e632-4168-8a9c-225eb7e15f43",
    "answers": [
        {
            "text": "un état souverain et indépendant",
            "role": "Final_category",
            "answer_start": 279,
            "answer_end": 311,
            "coref": {
                "anchor": {},
                "mentions": []
            },
            "wrong_answer": ""
        }
    ],
    "predictions": {
        "MT5-large_260_AP0": {
            "answer_pred": "état souverain et indépendant",
            "rougeL": 1.0
        }
    },
    "human_annot": {
        "MT5-large": [
            {
                "annot": "annot_1",
                "rating": "Correct"
            }
        ]
    },
    "lu_in_question": true,
    "nb_fe_frame": 2,
    "list_dep_lu_ans": [
        "conj",
        "obj"
    ],
    "nb_arc_lu_ans": 1,
    "complexity_vector": [
        0,
        1,
        0,
        0,
        1
    ]
}
```
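
A record like the one above could be read along the following lines. This is a minimal Python sketch under stated assumptions, not the authors' tooling: the function name `summarize_record` is hypothetical, the record is embedded as a trimmed string because this README does not describe the data file's name or top-level layout, and the end offset is treated as exclusive only because `answer_end - answer_start` equals the length of `text` in the example.

```python
import json

# Trimmed copy of the example record above (only the fields used here).
RECORD_JSON = """
{
  "question": "Quel type d'État devient l'Irlande ?",
  "answers": [
    {"text": "un état souverain et indépendant", "role": "Final_category",
     "answer_start": 279, "answer_end": 311}
  ],
  "predictions": {
    "MT5-large_260_AP0": {"answer_pred": "état souverain et indépendant", "rougeL": 1.0}
  },
  "human_annot": {
    "MT5-large": [{"annot": "annot_1", "rating": "Correct"}]
  }
}
"""


def summarize_record(record):
    """Collect per-model ROUGE-L scores and human ratings from one record."""
    # Assumption: end offsets are exclusive, so span length equals text length.
    spans_consistent = all(
        a["answer_end"] - a["answer_start"] == len(a["text"])
        for a in record["answers"]
    )
    rouge = {model: pred["rougeL"] for model, pred in record["predictions"].items()}
    ratings = {
        model: [annotation["rating"] for annotation in annotations]
        for model, annotations in record["human_annot"].items()
    }
    return {"spans_consistent": spans_consistent, "rougeL": rouge, "ratings": ratings}


summary = summarize_record(json.loads(RECORD_JSON))
print(summary["rougeL"])
print(summary["ratings"])
```

In the example, the key under `predictions` (`MT5-large_260_AP0`) differs from the key under `human_annot` (`MT5-large`), so the sketch keeps the two dictionaries separate rather than guessing how they should be joined.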