GPUs and GPU usage #39
-
Hello authors, thank you for your great repo and for ContraGAN! I had a couple of quick questions:

How many GPUs do you use to train the pre-trained models provided in the README (especially the BigGAN-2048 on ImageNet)?

I'm finding that GPU utilization is quite low when using multiple GPUs.

Thank you so much for your help!
-
Hi. I am sorry for the late reply.

How many GPUs do you use to train the pre-trained models provided in the README (especially the BigGAN-2048 on ImageNet)?
=>
- Models trained on CIFAR10: 1 GPU (2080 Ti, RTX TITAN, V100, A100, etc.)
- Models trained on Tiny_ImageNet: RTX TITAN x 1 (from DCGAN to SAGAN), RTX TITAN x 4 (from BigGAN to ContraGAN + ADA)
- Models trained on ImageNet: V100 32GB x 4 with Sync_BN (from SNGAN to BigGAN with batch size 256), V100 32GB x 8 with Sync_BN and DP (BigGAN with batch size 2048; training takes almost a month)

I'm finding that GPU utilization is quite low when using multiple GPUs.
=> Yes, that is likely because you are training the model with DataParallel (DP). If you train a model using DistributedDataParallel (DDP)…

Thank you:)
Best, Minguk
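For context on that last answer: DP runs a single process that scatters inputs and gathers gradients on one device every step, while DDP runs one process per GPU and overlaps gradient communication with the backward pass, which usually fixes the low-utilization symptom. Below is a minimal, generic DDP sketch -- the toy model, random data, and dummy loss are placeholders, and this is not StudioGAN's actual training entry point -- which also shows the SyncBatchNorm conversion that "Sync_BN" above refers to:

```python
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler


def worker(rank, world_size):
    # One process per GPU, NCCL backend; DDP all-reduces gradients during
    # backward(), unlike DP's single-process scatter/gather.
    dist.init_process_group("nccl", init_method="tcp://127.0.0.1:23456",
                            rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    # Toy stand-in; real training would build the generator/discriminator.
    model = torch.nn.Sequential(
        torch.nn.Conv2d(3, 16, 3, padding=1),
        torch.nn.BatchNorm2d(16),
        torch.nn.ReLU(),
    ).cuda(rank)
    # "Sync_BN": convert BatchNorm layers so batch statistics are
    # synchronized across all GPUs instead of computed per device.
    model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
    model = DDP(model, device_ids=[rank])

    # DistributedSampler gives each process a disjoint shard of the data.
    data = TensorDataset(torch.randn(512, 3, 32, 32))
    sampler = DistributedSampler(data, num_replicas=world_size, rank=rank)
    loader = DataLoader(data, batch_size=64, sampler=sampler)

    opt = torch.optim.Adam(model.parameters(), lr=2e-4)
    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for (x,) in loader:
            opt.zero_grad()
            out = model(x.cuda(rank, non_blocking=True))
            loss = out.pow(2).mean()  # dummy loss just to drive backward()
            loss.backward()
            opt.step()
    dist.destroy_process_group()


if __name__ == "__main__":
    n = torch.cuda.device_count()
    mp.spawn(worker, args=(n,), nprocs=n)
```

Run as a plain `python script.py`; `mp.spawn` launches one worker per visible GPU.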
-
Hi Minguk,

Thank you for the response! I really appreciate the detailed answers, and your updates to the repo today were great. To clarify, what were the exact commands used to train the BigGAN-256 and BigGAN-2048 models? If I'm going to train one of these for 3 or 4 weeks, I want to get the command right :)

For the models you have trained with 4/8 GPUs, did you use DP or DDP? From your previous response I gather that you used DP -- have you had success training models with DDP? Have you had any success with mixed precision training?

Also, when you use standing statistics for evaluation, …

Best,
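For context on the mixed-precision question: PyTorch's native route is torch.cuda.amp, combining autocast for the forward pass with GradScaler for the backward pass. A minimal, generic sketch follows -- the toy model and loss are for illustration only, and this is not StudioGAN's code:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

device = "cuda"
model = torch.nn.Linear(128, 10).to(device)  # toy model, not a GAN
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
scaler = GradScaler()  # scales the loss to avoid fp16 gradient underflow

for step in range(100):
    x = torch.randn(64, 128, device=device)
    target = torch.randint(0, 10, (64,), device=device)
    opt.zero_grad()
    with autocast():  # run the forward pass in mixed fp16/fp32 precision
        loss = torch.nn.functional.cross_entropy(model(x), target)
    scaler.scale(loss).backward()  # backward on the scaled loss
    scaler.step(opt)               # unscales grads, skips step on inf/NaN
    scaler.update()                # adjusts the scale factor over time
```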
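On standing statistics (the question above is cut off in the thread): in BigGAN, evaluation-time BatchNorm statistics are re-estimated by running the generator on fresh noise batches rather than reusing the running averages accumulated during training. A generic sketch of that idea -- `accumulate_standing_stats` is a hypothetical helper, not StudioGAN's API, and it assumes an unconditional generator:

```python
import torch

def accumulate_standing_stats(generator, z_dim, n_batches=16,
                              batch_size=64, device="cuda"):
    # Hypothetical helper (not StudioGAN's API): re-estimate BatchNorm
    # running statistics with fresh noise batches before evaluation,
    # in the spirit of BigGAN's "standing statistics".
    generator.train()  # BN layers only update their stats in train mode
    for m in generator.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
            m.momentum = None  # None = cumulative average over all batches
    with torch.no_grad():
        for _ in range(n_batches):
            # Assumes generator(z); a conditional model would also
            # need class labels sampled here.
            z = torch.randn(batch_size, z_dim, device=device)
            generator(z)
    generator.eval()  # evaluate with the freshly accumulated statistics
```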