
Add support for resume training from network pkl in run_training #6

Open · wants to merge 1 commit into master

Conversation

@tripzero commented Feb 6, 2020

No description provided.

@woctezuma commented Feb 14, 2020

If this is what I think it is, I wish the pull request were accepted.

However, I think the reason it might not be accepted is that the training schedule and the reporting are affected by two other variables, which the user should provide when training is resumed:

resume_pkl  = None,     # Network pickle to resume training from, None = train from scratch.
resume_kimg = 0.0,      # Assumed training progress at the beginning. Affects reporting and training schedule.
resume_time = 0.0,      # Assumed wallclock time at the beginning. Affects reporting.

Reference: here.
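
For illustration, a minimal sketch of how an entry point could collect those three settings and hand them to the training loop; only the keyword names resume_pkl, resume_kimg, and resume_time come from training_loop.py, while the flag names and helper below are hypothetical:

# Hypothetical sketch: collecting the resume settings at the entry point.
# Only the keys resume_pkl / resume_kimg / resume_time come from training_loop.py;
# the flag names and the helper are illustrative.
import argparse

def parse_resume_args():
    parser = argparse.ArgumentParser(description='Resume-training options (sketch)')
    parser.add_argument('--resume-pkl', default=None,
                        help='Network pickle to resume training from (None = train from scratch)')
    parser.add_argument('--resume-kimg', type=float, default=0.0,
                        help='Assumed training progress in kimg; affects reporting and schedule')
    parser.add_argument('--resume-time', type=float, default=0.0,
                        help='Assumed wallclock time at the beginning; affects reporting')
    args = parser.parse_args()
    # These keys mirror the training_loop() keyword arguments quoted above.
    return dict(resume_pkl=args.resume_pkl,
                resume_kimg=args.resume_kimg,
                resume_time=args.resume_time)

if __name__ == '__main__':
    print(parse_resume_args())

Passing resume_kimg along with the pickle is what keeps the reporting and the training schedule consistent with the earlier run, per the comments quoted above.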

@vsemecky left a comment

Great. This is exactly what I missed.

@ayush9198gupta commented Apr 20, 2020

Hi,
I am running the StyleGAN2 model in Google Colab (Pro version) and have loaded the pretrained network file (stylegan2-ffhq-config-f.pkl), but when I execute the code it exhausts the Colab RAM and the session restarts. Please help me find a solution to this issue.

Sharing the Colab settings I am using now:
Hardware accelerator: GPU
Runtime shape: HIGH-RAM

The second issue: I have loaded the latest pickle file from a StyleGAN model into the StyleGAN2 model for transfer learning, but after execution it is not saving the results. I have made the following changes in the training_loop.py file.

# Load the latest pickle file generated from the StyleGAN model:
resume_pkl = './results/00003-stylegan2-anime_images-1gpu-config-b/network-snapshot-012672.pkl',

# To save the output after every tick, I changed:
image_snapshot_ticks = 1,
network_snapshot_ticks = 1,

Kindly get back to me on these two issues.
Thanks
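
For reference, a minimal sketch of passing those values as overrides rather than editing the defaults inside training_loop.py; only the keyword names come from this thread, and the helper function is hypothetical:

# Hypothetical sketch: build keyword overrides for training_loop(**overrides)
# instead of editing the defaults inside training_loop.py. Only the keyword
# names (resume_pkl, image_snapshot_ticks, network_snapshot_ticks) come from
# this thread; everything else is illustrative.

def training_overrides(resume_pkl, image_ticks=1, network_ticks=1):
    return dict(
        resume_pkl=resume_pkl,                 # pickle to resume from
        image_snapshot_ticks=image_ticks,      # export example images every tick
        network_snapshot_ticks=network_ticks,  # export a network pickle every tick
    )

if __name__ == '__main__':
    overrides = training_overrides(
        './results/00003-stylegan2-anime_images-1gpu-config-b/network-snapshot-012672.pkl')
    for key, value in overrides.items():
        print(key, '=', value)

Keeping the overrides at the call site also makes it easier to compare runs, since the defaults in training_loop.py stay untouched.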

@ahmedshingaly commented

CUDA_ERROR_OUT_OF_MEMORY
I have one GPU (a 2080 Ti), yet I get the above error.
How can I train using one GPU, and where do I reduce the batch size?

@obravo7
Copy link

obravo7 commented Jun 5, 2020

@ahmedshingaly
You might be out of luck. Are you trying to train config-f at full resolution (1024×1024)? If so, the CUDA_ERROR_OUT_OF_MEMORY is to be expected. The authors state that

Note that training FFHQ at 1024×1024 resolution requires GPU(s) with at least 16 GB of memory.

and a 2080 Ti has only 11 GB of memory.
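
For the single-GPU memory question above, a minimal sketch of schedule overrides that shrink the per-GPU minibatch; the field names follow the upstream StyleGAN2 schedule arguments (minibatch_size_base, minibatch_gpu_base), and whether this fork exposes them the same way is an assumption:

# Hypothetical sketch: schedule overrides to reduce memory use on a single GPU.
# Field names follow the upstream StyleGAN2 schedule arguments; the values
# below are illustrative and may trade training quality for fitting in memory.

def low_memory_sched_overrides(num_gpus=1, minibatch_gpu=2):
    # minibatch_gpu = images processed per GPU per step (smaller = less memory)
    return dict(
        minibatch_size_base=minibatch_gpu * num_gpus,  # total minibatch across all GPUs
        minibatch_gpu_base=minibatch_gpu,
    )

if __name__ == '__main__':
    print(low_memory_sched_overrides())

Training at a lower resolution (for example 512×512) is the other common way to fit an 11 GB card.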

@ahmedshingaly commented

(in reply to @obravo7's comment above)

You are right. I am using a custom dataset, and the error still persists. I will try running it on Google Colab and see if that gives a different result.
