This folder contains the training and evaluation code for CVA-MVSNet.
Create a conda environment and install the requirements:

```shell
conda create -n tandem python=3.7
conda activate tandem
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt
```
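To quickly verify that the installation worked, one can check the PyTorch version and CUDA availability. This sanity check is our suggestion, not part of the original setup:

```shell
# Optional sanity check: should print 1.5.1 and True (on a machine with a CUDA GPU).
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```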
**Comment**: The environment uses PyTorch 1.5.1 and PyTorch-Lightning 0.7.6 because this was the environment we used for development. However, we don't use features specific to this PyTorch version, so an upgrade should hopefully be easy. On the other hand, PyTorch-Lightning has introduced significant breaking changes, so upgrading it could be more cumbersome. Our model is defined in `models/cva_mvsnet.py` and wrapped by PyTorch-Lightning in `models/tandem.py`, such that removing PyTorch-Lightning from the project should be possible.
**Reproducibility**: The `requirements.txt` does not contain version numbers for packages like `numpy`, where we do not expect breaking changes. For reproducibility we include `reproducibility/requirements_conda.txt` and `reproducibility/requirements_pip.txt` with exact version numbers, generated by `conda list --export` and `pip freeze`, respectively.
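These pinned files should allow recreating the exact development environment. A sketch of how one might do this, relying on the fact that `conda create --file` accepts the format written by `conda list --export` (the environment name `tandem_repro` is our choice):

```shell
# Recreate the pinned environment from the exported package lists (sketch).
conda create -n tandem_repro --file reproducibility/requirements_conda.txt
conda activate tandem_repro
pip install -r reproducibility/requirements_pip.txt
```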
Config values are documented in `config/default.yaml`. In `config/abl0*.yaml` we include the config files for the models shown in the ablation study (Table 2). You can start a training with

```shell
export TANDEM_DATA_DIR=/path/to/downloaded/tandem_replica
python train.py --config config/default.yaml path/to/out/folder DATA.ROOT_DIR $TANDEM_DATA_DIR
```

where `path/to/out/folder` may not exist yet and will contain the tensorboard logs and checkpoints. You can override further config parameters by listing them at the end of the command, e.g. `TRAIN.EPOCHS 100`.
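For example, a run that trains for 100 epochs by overriding `TRAIN.EPOCHS`, with tensorboard pointed at the output folder (the override syntax is as described above; the `tensorboard` invocation is just the standard way to view the generated logs):

```shell
export TANDEM_DATA_DIR=/path/to/downloaded/tandem_replica
# Config overrides are KEY VALUE pairs appended to the command.
python train.py --config config/default.yaml path/to/out/folder \
    DATA.ROOT_DIR $TANDEM_DATA_DIR TRAIN.EPOCHS 100

# In a second terminal: monitor training via the tensorboard logs in the out folder.
tensorboard --logdir path/to/out/folder
```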
For convenience we include pretrained models in `pretrained/ablation`, together with their evaluation numbers on TANDEM Replica. The pretrained models can be evaluated with

```shell
export TANDEM_DATA_DIR=/path/to/downloaded/tandem_replica
./eval.sh
```

**Note**: This will overwrite the `*.txt` and `*.pkl` files in `pretrained/ablation`.
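Because the shipped evaluation numbers are overwritten, it may be worth backing them up first. A small convenience sketch (the backup directory name is our choice, not part of the original workflow):

```shell
# Preserve the shipped evaluation results before eval.sh overwrites them.
mkdir -p pretrained/ablation_backup
cp pretrained/ablation/*.txt pretrained/ablation/*.pkl pretrained/ablation_backup/
```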
The export script is located at `export_model.py`. Due to PyTorch internals, one has to specify the resolution and the number of frames used during inference; this could potentially be avoided with some code changes. The script also exports predictions on sample inputs so that the C++ wrapper can check whether the export was correct. It additionally generates `out_dir/depth.png` and `out_dir/confidence.png`, which can be inspected to ensure correctness. There are multiple optimization options for the jit'ed model, and new PyTorch versions will presumably add more; one should experiment to find the option that gives the best performance on their system.
```shell
mkdir -p cpp_exported_models
python export_model.py --data_dir $TANDEM_DATA_DIR --out_dir cpp_exported_models --model pretrained/ablation/abl04_fewer_depth_planes.ckpt --height 480 --width 640 --view_num 7 --jit_freeze --jit_run_frozen_optimizations
```
The exported model can be checked with a C++ executable built by the TANDEM code:

```shell
./tandem/build/libdr/dr_mvsnet/bin/dr_mvsnet_test cpp_exported_models/model.pt cpp_exported_models/sample_inputs.pt 10
```

The last argument (here `10`) gives the number of repetitions used for benchmarking the performance. These numbers are more reliable than those obtained with `--profile` for `export_model.py`, because the executable uses the same C++ code as TANDEM.
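To compare jit optimization options, one can export the model into separate directories with different flags and benchmark each export with the test executable. A sketch, assuming that omitting `--jit_run_frozen_optimizations` simply disables that optimization:

```shell
# Export two variants: with and without the frozen-graph optimizations (sketch).
for variant in with_opt no_opt; do mkdir -p cpp_export_$variant; done
python export_model.py --data_dir $TANDEM_DATA_DIR --out_dir cpp_export_with_opt \
    --model pretrained/ablation/abl04_fewer_depth_planes.ckpt \
    --height 480 --width 640 --view_num 7 --jit_freeze --jit_run_frozen_optimizations
python export_model.py --data_dir $TANDEM_DATA_DIR --out_dir cpp_export_no_opt \
    --model pretrained/ablation/abl04_fewer_depth_planes.ckpt \
    --height 480 --width 640 --view_num 7 --jit_freeze

# Benchmark both exports with the same number of repetitions and compare timings.
for variant in with_opt no_opt; do
  ./tandem/build/libdr/dr_mvsnet/bin/dr_mvsnet_test \
      cpp_export_$variant/model.pt cpp_export_$variant/sample_inputs.pt 50
done
```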
We include the `sbatch` scripts we used on our SLURM cluster, but these might have to be adapted for a different setup.
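For orientation, a minimal sketch of what such a script might look like; the job name, partition, resource requests, and time limit are placeholders, not the values we used:

```shell
#!/bin/bash
#SBATCH --job-name=cva_mvsnet      # placeholder job name
#SBATCH --partition=gpu            # placeholder partition
#SBATCH --gres=gpu:1               # request one GPU
#SBATCH --cpus-per-task=8          # placeholder CPU count
#SBATCH --time=48:00:00            # placeholder time limit
#SBATCH --output=slurm_%j.log

source activate tandem             # or `conda activate tandem`, depending on the cluster setup
export TANDEM_DATA_DIR=/path/to/downloaded/tandem_replica
python train.py --config config/default.yaml path/to/out/folder DATA.ROOT_DIR $TANDEM_DATA_DIR
```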
We thank Xiaodong Gu, Zhiwen Fan, Zuozhuo Dai, Siyu Zhu, Feitong Tan, and Ping Tan for open-sourcing their excellent work "Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching".