EASIER gloss translation models

Attribution required

If you use any of my code, please cite this repository as follows:

@misc{mueller2022easier-gloss-translation-models,
    title={EASIER gloss translation models},
    author={M\"{u}ller, Mathias},
    howpublished={\url{https://github.com/bricksdont/easier-gloss-translation}},
    year={2022}
}

Basic setup

Create a venv:

./scripts/setup/create_venv.sh

Then install required software:

./scripts/setup/install.sh

If the BSL corpus (BSLCP) is used as training data, BSLCP_USERNAME and BSLCP_PASSWORD must be set as environment variables before submitting any runs.
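As a minimal sketch of how a script could fail early when credentials are missing, the helper below checks that each named environment variable is set and non-empty. The function name require_env is hypothetical and not part of this repository, and the sketch assumes bash.

```shell
# Hypothetical helper (not part of the repository): report an error if any of
# the named environment variables is unset or empty.
require_env() {
    local name
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable
        # whose name is stored in $name.
        if [ -z "${!name:-}" ]; then
            echo "error: environment variable $name is not set" >&2
            return 1
        fi
    done
}

# Example usage, before submitting a run (the value is a placeholder):
#   export BSLCP_USERNAME="your-username"
#   require_env BSLCP_USERNAME || exit 1
```

Further variable names can be passed as additional arguments to the same call.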

Dry run

Create all files and run all scripts on CPU only, with each step exiting immediately instead of performing any actual computation:

./scripts/running/dry_run_baseline.sh

Run a bilingual baseline

Train a baseline system for DGS -> DE:

./scripts/running/run_baseline.sh

Train and evaluate all bilingual models

./scripts/running/run_bilingual_models.sh

Train and evaluate all multilingual models

./scripts/running/run_multilingual_models.sh

Define a custom run

Construct a new top-level file similar to the existing files in scripts/running. Most importantly, define how the training, dev and test data should be composed by assigning the variable language_pairs:

language_pairs=(
    "uhh de dgs_de"
    "bslcp en bsl"
)

The structure of each row in this array is [source corpus] [src] [trg], where src and trg are the source and target language, and there can be arbitrarily many rows.
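To make the row format concrete, the following sketch splits each row into its three fields. The function name describe_pairs is hypothetical and only serves as an illustration; it is not how the repository scripts necessarily consume the array. It assumes bash.

```shell
# Illustration only: how a row of language_pairs splits into its three fields.
language_pairs=(
    "uhh de dgs_de"
    "bslcp en bsl"
)

# Hypothetical helper: print corpus, source and target language of every row.
describe_pairs() {
    local row corpus src trg
    for row in "${language_pairs[@]}"; do
        # Each row is "[source corpus] [src] [trg]", separated by spaces.
        read -r corpus src trg <<< "$row"
        echo "corpus=$corpus src=$src trg=$trg"
    done
}

describe_pairs
# prints:
#   corpus=uhh src=de trg=dgs_de
#   corpus=bslcp src=en trg=bsl
```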

Your custom running script must eventually call scripts/running/run_generic.sh.
Set multilingual="true" if the MT system needs an indication of the desired target language (i.e. if there are several target languages).
If data from both UHH and BSLCP is used, set training_corpora="uhh bslcp".
Before training an actual model, set dry_run="true" to test your setup.
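Put together, a custom run script might look roughly like the sketch below. The variable names language_pairs, multilingual, training_corpora and dry_run come from this README. Everything else is an assumption: the scripts variable, the guard around the call, and sourcing run_generic.sh with "." so that it can see the variables. The existing files in scripts/running are the authoritative reference and may require additional variables.

```shell
#! /bin/bash

# Sketch of a custom top-level run script, under the assumptions stated above.

# Assumption: path to the scripts folder, relative to the repository root.
scripts=${scripts:-./scripts}

# Two corpora with two different target languages (dgs_de and bsl).
language_pairs=(
    "uhh de dgs_de"
    "bslcp en bsl"
)

# Several target languages, so the system needs a target language indication.
multilingual="true"

# Data from both UHH and BSLCP is used.
training_corpora="uhh bslcp"

# Test the setup first; switch to "false" for an actual training run.
dry_run="true"

# Assumption: run_generic.sh is sourced rather than executed, so that it can
# read the variables defined above.
if [ -f "$scripts/running/run_generic.sh" ]; then
    . "$scripts/running/run_generic.sh"
else
    echo "run_generic.sh not found under $scripts/running" >&2
fi
```

Running the script once with dry_run="true" before switching it off follows the recommendation above.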

Create and upload a summary of all experiment outcomes

./scripts/summaries/summarize.sh

Create result tables shown in the paper

https://colab.research.google.com/drive/1xDOkBI3yOoKk1CI_BBZoLtVWAaY0uhWd?usp=sharing
