This is the source code of DeepLiDARFlow for dense scene flow estimation using monocular camera and sparse LiDAR.
The model is trained and tested on:
- Windows 10 Pro and/or Ubuntu 18.04
- Python 3.7.8
- gast==0.2.2 (to resolve incompatibilities between python version and tensorflow)
- imageio==2.5.0
- matplotlib==3.2.1
- numpy==1.16.4
- tensorflow-gpu==1.14
- tqdm==4.31.1
- All required libraries can be installed by running
pip install -r requirements.txt
.
- The original form of FlyingThings3D (FT3D) is used for pre-training and KITTI 2015 scene-flow is used for fine-tuning. All mappings have been done.
- For KITTI data set, downloading the de-warped disparities 1 to the view at time t+1 is required by running
sh download_preprocessed_kitti.sh
. Then, the folder should be placed in the KITTI data set (training) folder.
- Running
sh run_train.sh
will train the model from scratch. As a prerequisite, the correct path of FlyingThings3D data set should be provided for--data_path
flag. - Running
sh run_finetune.sh
will fine-tune the model. As a prerequisites, the correct path of KITTI data set should be provided for--data_path
flag and the pre-trained model path on FlyingThings3D should be provided for--pretrained_model
flag. - During the training, best weights will be stored in the model directory and make a JSON file named
best_checkpoints
. If the best_checkpoints is available and the flag--best_checkpoint
is True, the training will continue with the best checkpoint.
- For pre-training on FlyingThings3D, the maximum epochs number is defined with 600 epochs. The initial learning rate is 0.0001 and it's reduced to 0.00001 in case of over-fitting.
- For fine-tuning on KITTI, the model needs to be fine-tuned with 200 epochs. First 100 epochs are trained using 0.0001 learning rate, then it's reduced to 0.00001 for the rest epochs.
- If the training or fine-tuning will be continued after changing learning rates or maximum epochs number, the
--best_checkpoint
flag should be activated to call the best weights for continuing the training.
- Running
sh run_evaluation.sh
will evaluate the whole frames in the test split of FlyingThings3D and the mapped frames of KITTI in 'KITT_test.txt'.- The number of LiDAR samples can be estimated by setting the
--samples
flag [default: 5000]. - The data set type (either 'KITTI' or 'FT3D' should be provided for the flag
--dataset
. - The path of the data set should be given for
--data_path
. - The model path should be given for
--model_path
[default: 'model/DeepLiDARFlow'].
The software can take the suitable model based on data set type--dataset
.
- The number of LiDAR samples can be estimated by setting the
- Running
sh run_inference.sh
can evaluate the model on two selected frames inside 'images' folder (one is KITTI example and another is FT3D).- 5000 of LiDAR points are considered with these frames.
- The example flag
--ex
can be specified either by '1' for KITTI frame or '2' for FT3D frame. - The model path should be given for
--model_path
[default: 'model/DeepLiDARFlow'] and the software can take the suitable model based on image resolution.
- The execution of
download_models.sh
can download the weights for both data sets (FT3D/KITTI) inside the 'model' folder:
model/DeepLiDARFlow-FT3D
model/DeepLiDARFlow-KITTI
If you find the code or the paper useful, consider citing us:
@inproceedings{DeepLiDARFlow2020,
title={{DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation Using Monocular Camera and Sparse LiDAR}},
author={Rishav and Battrawy, Ramy and Schuster, Ren{\'e} and Wasenm{\"u}ller, Oliver and Stricker, Didier},
booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year={2020},
}