# Training

The easiest way to train a [Reading Machine](readingMachine.md) is to use the scripts provided by *macaon_data*.\
For example, if one would like to train a *parser* called *myFrenchParser* on the UD treebank French-GSD :\
`$ cd macaon_data/UD_any`\
`$ ./prepareExperiment.sh UD_French-GSD parser myFrenchParser`\
`$ ./train.sh tsv bin/myFrenchParser`

## prepareExperiment.sh

The purpose of this script is simply to generate a new experiment directory inside *bin/*.\
The usage is `./prepareExperiment.sh corpusName templateName experimentName`.

## train.sh

This script will train your model by calling `macaon train` with the correct arguments.\
The usage is `./train.sh mode experimentPath arguments`, where :
* mode is txt if your model does tokenization, tsv if it doesn't.
* experimentPath is the relative path to your model.
* arguments is a list of arguments to give to `macaon train`, it can be empty.

Example : `$ ./train.sh tsv bin/myFrenchParser -n 30 --batchSize 128`.

For a list of available arguments execute `macaon train -h`.\
You can inspect how a model has been trained by looking at the file `bin/yourModel/train.info`.\
You can stop training and resume it at any time, which also lets you increase the number of epochs.
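
As a rough illustration of inspecting and resuming a run, here is a minimal sketch of a helper. `resume_training` and the `DRY_RUN` variable are hypothetical names introduced for this example. It also assumes two things this page does not confirm: that `-n` sets the number of epochs, and that calling `train.sh` again on an existing experiment resumes it. Check `macaon train -h` before relying on either.

```shell
#!/bin/sh
# Hypothetical helper, to be run from macaon_data/UD_any.
# Assumptions (not confirmed here): `-n` is the epoch count, and re-running
# train.sh on an existing experiment resumes training instead of restarting.
# Set DRY_RUN=1 to print the command instead of executing it.
resume_training() {
    mode="$1"; experiment="$2"; epochs="$3"

    # train.info describes how the model has been trained so far.
    if [ -f "$experiment/train.info" ]; then
        cat "$experiment/train.info"
    fi

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "./train.sh $mode $experiment -n $epochs"
    else
        ./train.sh "$mode" "$experiment" -n "$epochs"
    fi
}
```

For instance, `DRY_RUN=1 resume_training tsv bin/myFrenchParser 40` only prints the command it would run.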

## evaluate.sh

This script will evaluate your trained model against the test corpora using the official [CoNLL 2018 Shared Task](http://universaldependencies.org/conll18/evaluation.html) eval script.\
Under the hood it calls `macaon decode`. The usage is `./evaluate.sh mode experimentPath arguments`, where :
* mode is txt if your model does tokenization, tsv if it doesn't.
* experimentPath is the relative path to your model.
* arguments is a list of arguments to give to `macaon decode`, it can be empty.

Example : `$ ./evaluate.sh tsv bin/myFrenchParser`
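
The three scripts described so far can be chained into one experiment run. The following is only a sketch: `run_step`, `run_experiment` and `DRY_RUN` are hypothetical names, and it assumes the experiment directory is created as `bin/experimentName`, as in the examples on this page.

```shell
#!/bin/sh
# Sketch of a prepare -> train -> evaluate pipeline, run from macaon_data/UD_any.
# run_step and run_experiment are hypothetical helpers, not part of macaon_data.

# Execute a command, or only print it when DRY_RUN=1.
run_step() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: run_experiment corpusName templateName experimentName mode
run_experiment() {
    corpus="$1"; template="$2"; name="$3"; mode="$4"

    # Each step runs only if the previous one succeeded.
    run_step ./prepareExperiment.sh "$corpus" "$template" "$name" &&
    run_step ./train.sh "$mode" "bin/$name" &&
    run_step ./evaluate.sh "$mode" "bin/$name"
}
```

Calling `run_experiment UD_French-GSD parser myFrenchParser tsv` would then reproduce the example from the top of this page, followed by an evaluation.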

## Using your trained model

Once a model has been trained, you can use it to annotate text.\
If your model doesn't do tokenization, your input file must be formatted in the [CoNLL-U Plus Format](https://universaldependencies.org/ext-format.html). Otherwise, your input file must be raw UTF-8 text.\
To use your trained model `myFrenchParser` to annotate the text in the file `myFrenchFile.conllu` :
* `$ macaon decode --model bin/myFrenchParser --inputTSV myFrenchFile.conllu`

The annotated file will be printed to the standard output.
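
Since the result goes to the standard output, a shell redirection is enough to save it. Below is a minimal sketch of a wrapper that does this. `annotate_tsv` and the output file name are hypothetical; only the `--model` and `--inputTSV` options shown above are used.

```shell
#!/bin/sh
# Hypothetical wrapper: annotate a CoNLL-U Plus file and save the result.
# Usage: annotate_tsv modelPath inputFile outputFile
annotate_tsv() {
    model="$1"; input="$2"; output="$3"

    # Fail early with a clear message if the input file is missing.
    if [ ! -f "$input" ]; then
        echo "input file not found: $input" >&2
        return 1
    fi

    # macaon decode prints the annotation to stdout, so redirect it to a file.
    macaon decode --model "$model" --inputTSV "$input" > "$output"
}
```

For example, `annotate_tsv bin/myFrenchParser myFrenchFile.conllu myFrenchFile.annotated.conllu` would write the annotated text to `myFrenchFile.annotated.conllu`.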

[Back to main page](../README.md)