Our code for ICMR'22 Oral paper "HybridVocab: Towards Multi-Modal Machine Translation via Multi-Aspect Alignment".

pengr/MAT-MMT

HybridVocab: Towards Multi-Modal Machine Translation via Multi-Aspect Alignment [Paper]

Step1: Requirements

  • Build the running environment (either of two ways)
  1. pip install --editable .
  2. python setup.py build_ext --inplace
  • Install the syntax parser
  pip install stanza==1.2.2 stanza_batch==0.2.2
  • pytorch==1.7.0, torchvision==0.8.0, cudatoolkit=10.1 (installing with pip also works)
  conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.1 -c pytorch
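
To confirm that the pinned versions above are what actually got installed, a small check along the following lines may help. It is a minimal sketch, not part of this repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and it assumes Python 3.8 or newer (for `importlib.metadata`) and that the packages are registered under the distribution names shown.

```python
from importlib import metadata

# Pins copied from the requirements list above.
PINNED = {
    "torch": "1.7.0",
    "torchvision": "0.8.0",
    "stanza": "1.2.2",
    "stanza_batch": "0.2.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose version differs from its pin to (expected, found)."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        # Local build tags such as "1.7.0+cu101" still count as a match.
        if found is None or found.split("+")[0] != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches(PINNED).items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

Running the file prints one line per package that is missing or at a different version, and prints nothing when everything matches.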

Step2: Data Preparation

The dataset used in this work is Multi30K. Both the original version and the preprocessed version that I used are available here.

You can also download your own dataset and refer to experiments/prepare-iwslt14.sh or experiments/prepare-wmt14en2de.sh to preprocess it.

| File Name | Description | Download |
| --- | --- | --- |
| resnet50-avgpool.npy | Pre-extracted image features; each image is represented as a 2048-dimensional vector. | Link |
| Multi30K EN-DE Task | BPE+TOK text, Image Index, Label for the English-German task (including train, val, test2016/17/mscoco) | Link |
| Multi30K EN-FR Task | BPE+TOK text, Image Index, Label for the English-French task (including train, val, test2016/17/mscoco) | Link |
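
As a quick sanity check on the downloaded files, the feature matrix can be loaded with NumPy and paired with the image indices. The sketch below is illustrative only and makes several assumptions that are not confirmed by this repository: that the `.npy` file holds a 2-D array of shape (num_images, 2048), that each Image Index file contains one integer per line, and that those integers are 0-based row numbers (pass `offset=1` if they turn out to be 1-based). The function names are hypothetical.

```python
import numpy as np


def load_features(path, expected_dim=2048):
    """Load the feature matrix and check that each row is one image vector."""
    features = np.load(path)
    if features.ndim != 2 or features.shape[1] != expected_dim:
        raise ValueError(
            f"expected shape (num_images, {expected_dim}), got {features.shape}"
        )
    return features


def read_image_indices(path):
    """Read one integer image index per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [int(line) for line in handle if line.strip()]


def features_for_sentences(features, indices, offset=0):
    """Return one feature row per sentence; offset=1 would suit 1-based indices."""
    rows = np.asarray(indices, dtype=np.int64) - offset
    if rows.size and (rows.min() < 0 or rows.max() >= len(features)):
        raise IndexError("image index outside the feature matrix")
    return features[rows]
```

With these helpers, `features_for_sentences(load_features("resnet50-avgpool.npy"), read_image_indices(index_file))` would give an array with one 2048-dimensional row per sentence, where `index_file` stands for whichever Image Index file you downloaded.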

Step3: Running code

You can run the code with the scripts in the experiments directory.

  1. preprocess the dataset into torch format

    bash pre.sh
  2. train the model

    bash train.sh
  3. generate target sentences

    bash gen.sh
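
If you prefer to launch the three steps above in one go, a small driver like the following could be used. This is a sketch under stated assumptions, not something shipped with the repository: it assumes the scripts take no arguments and are meant to be run from inside the experiments directory, and the names `build_commands` and `run_pipeline` are hypothetical.

```python
import subprocess

# Script order taken from the three steps above.
STAGES = ("pre.sh", "train.sh", "gen.sh")


def build_commands(stages=STAGES):
    """Return one ["bash", script] command per stage, in order."""
    return [["bash", name] for name in stages]


def run_pipeline(script_dir="experiments", stages=STAGES, dry_run=False):
    """Run each stage from script_dir; check=True stops at the first failure."""
    commands = build_commands(stages)
    for command in commands:
        print("running:", " ".join(command), "in", script_dir)
        if not dry_run:
            subprocess.run(command, cwd=script_dir, check=True)
    return commands


if __name__ == "__main__":
    run_pipeline()
```

Calling `run_pipeline(dry_run=True)` only prints the commands, which is a cheap way to check the order before starting a long training run.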

Citation

If you use the code in your research, please cite:

@inproceedings{peng2022hybridvocab,
    title={HybridVocab: Towards Multi-Modal Machine Translation via Multi-Aspect Alignment},
    author={Peng, Ru and Zeng, Yawen and Zhao, Junbo},
    booktitle={Proceedings of the 2022 International Conference on Multimedia Retrieval},
    pages={380--388},
    year={2022}
}
