Hi,

Thank you for providing all the code and the hyperparameter specifications for the experiments. Unfortunately, I'm having trouble reproducing your results for the Glow architecture on MNIST and FashionMNIST. When I run the provided command, the model starts training but stops after 15 epochs with RuntimeError('Scale factor has NaN entries'). Before the error, the loss increases to 2.8e+12.
It would be great if you could find why it fails or tell me what I am doing wrong 🙂
Traceback (most recent call last):
File "/nfs/homedirs/wildr/flows_ood/train_unsup.py", line 361, in <module>
train(epoch, net, trainloader, device, optimizer, loss_fn, args.max_grad_norm, writer,
File "/nfs/homedirs/wildr/flows_ood/train_unsup.py", line 68, in train
z = net(x)
File "/nfs/homedirs/wildr/anaconda3/envs/ood_flows/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/nfs/homedirs/wildr/anaconda3/envs/ood_flows/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 153, in forward
return self.module(*inputs[0], **kwargs[0])
File "/nfs/homedirs/wildr/anaconda3/envs/ood_flows/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/nfs/homedirs/wildr/flows_ood/flow_ssl/glow/glow.py", line 24, in forward
return self.body(x)
File "/nfs/homedirs/wildr/anaconda3/envs/ood_flows/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/nfs/homedirs/wildr/anaconda3/envs/ood_flows/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
input = module(input)
File "/nfs/homedirs/wildr/anaconda3/envs/ood_flows/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/nfs/homedirs/wildr/anaconda3/envs/ood_flows/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
input = module(input)
File "/nfs/homedirs/wildr/anaconda3/envs/ood_flows/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/nfs/homedirs/wildr/flows_ood/flow_ssl/invertible/parts.py", line 121, in forward
return self.module1(x), self.module2(z)
File "/nfs/homedirs/wildr/anaconda3/envs/ood_flows/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/nfs/homedirs/wildr/flows_ood/flow_ssl/realnvp/coupling_layer.py", line 327, in forward
raise RuntimeError('Scale factor has NaN entries')
RuntimeError: Scale factor has NaN entries
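In case it helps with narrowing this down: since the loss climbs to 2.8e+12 before the NaN check fires, a guard that stops the run at the first sign of divergence might show which epoch and batch the blow-up starts in. Below is a minimal sketch of what I have in mind. `DivergenceGuard` and `blowup_factor` are hypothetical names of my own and not part of the flows_ood code, and the threshold of 1e3 is an arbitrary assumption.

```python
import math


class DivergenceGuard:
    """Flags a training run whose loss is non-finite or has blown up.

    Hypothetical helper for illustration; not part of the flows_ood code.
    """

    def __init__(self, blowup_factor=1e3):
        self.blowup_factor = blowup_factor  # assumed threshold, tune per run
        self.best = None  # lowest loss seen so far

    def check(self, loss_value):
        """Return a reason string if the loss looks divergent, else None."""
        if not math.isfinite(loss_value):
            return "loss is NaN or infinite"
        if self.best is None or loss_value < self.best:
            self.best = loss_value
            return None
        # Compare against the best loss, with a floor of 1.0 so that a
        # best loss near zero does not trigger on small fluctuations.
        limit = self.blowup_factor * max(abs(self.best), 1.0)
        if loss_value > limit:
            return f"loss {loss_value:.3g} exceeds {limit:.3g}"
        return None
```

In the training loop this would be called once per batch with the scalar loss as a Python float (for example `loss.item()`, assuming the loss is a scalar tensor), and the run would stop and log the epoch and batch index when `check` returns a reason.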