Network seems to fail to converge on ShapeNet #3

Open
nikwl opened this issue Jan 7, 2022 · 1 comment

nikwl commented Jan 7, 2022

Hi! Thanks again for publishing your code online.
I'm trying to test your network on some classes from ShapeNet by training a separate network for each class, but I'm getting very poor results (and inconsistent results when training on the GPU). I think this might be caused by a library issue, or perhaps the weights are being saved incorrectly. I wanted to post my results here and ask for any guidance you might have.

I've trained two versions of 3D-ORGAN on the mugs dataset from ShapeNet. For training I have 854 instances. I'm using a batch size of 170, and training for 400 epochs with all the default hyperparameters. The training data looks like this:

Target Shape Source Shape
mug00_L00 mug00_L01

Training on the CPU (which takes around 2 days) produces predicted outputs that look like this:

Input Shape Predicted Shape
mug_pred01 mug_pred02

Training on the GPU (which takes around 2 hours) produces predicted outputs that look like this:

Input Shape Predicted Shape
mug_pred_gpu03 mug_pred_gpu04

The examples shown here are typical of the rest of the samples in the dataset. I'm a little worried about the drastic difference between training on the GPU and training on the CPU. I suspect I may have installed some of the GPU libraries incorrectly, which would explain why the GPU version fails so badly. The libraries I'm using are listed in my previous issue (#2).
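One way to test the hypothesis that weights are being saved incorrectly is to compare the weight arrays held in memory at the end of training with the arrays obtained after reloading the saved file. The sketch below is framework-agnostic and assumes the weights can be exported as a list of NumPy arrays. The helper `max_abs_diffs` and the random stand-in arrays are hypothetical and not part of the 3D-ORGAN code.

```python
import numpy as np


def max_abs_diffs(weights_a, weights_b):
    """Return the largest absolute element-wise difference for each array pair."""
    if len(weights_a) != len(weights_b):
        raise ValueError("weight lists have different lengths")
    diffs = []
    for a, b in zip(weights_a, weights_b):
        a = np.asarray(a, dtype=np.float64)
        b = np.asarray(b, dtype=np.float64)
        if a.shape != b.shape:
            raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
        # An empty array has nothing to compare, so report zero difference.
        diffs.append(float(np.max(np.abs(a - b))) if a.size else 0.0)
    return diffs


# Hypothetical stand-ins for "weights before saving" and "weights after reloading".
rng = np.random.default_rng(0)
before = [rng.normal(size=(4, 4)), rng.normal(size=(4,))]
after = [w.copy() for w in before]

print(max_abs_diffs(before, after))  # identical copies give [0.0, 0.0]
```

If every difference is zero (or within floating-point tolerance), the save and load path is probably not the cause, and the GPU library installation becomes the more likely suspect.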

I also tried training on the airplanes dataset from ShapeNet with the GPU only. For training I have 3141 instances. I'm using a batch size of 314, and training for 400 epochs with all the default hyperparameters. Predicted outputs look like this:

Input Shape Predicted Shape
plane_pred_gpu00_L01 plane_pred_gpu00_L00

So it seems to be essentially just reproducing the input without adding anything. Again, the example shown is typical of the rest of the samples in the dataset. This is also worrying because the behavior is very different from the predictions for mugs.
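For context, the batch sizes and instance counts quoted above imply a fairly small number of batches per epoch. The sketch below works out the arithmetic, assuming one pass over the data per epoch and that the final partial batch is kept. Whether the repository's data loader keeps or drops that batch is not stated here, and `update_counts` is a hypothetical helper.

```python
import math


def update_counts(num_instances, batch_size, epochs):
    """Return (batches per epoch, total batches, size of the last batch)."""
    steps_per_epoch = math.ceil(num_instances / batch_size)
    # Size of the final batch; a remainder of zero means it is a full batch.
    last_batch = num_instances % batch_size or batch_size
    return steps_per_epoch, steps_per_epoch * epochs, last_batch


for name, n, b in [("mugs", 854, 170), ("airplanes", 3141, 314)]:
    steps, total, last = update_counts(n, b, epochs=400)
    print(f"{name}: {steps} batches/epoch, {total} batches total, last batch has {last} samples")
```

Under these assumptions the mugs run sees 6 batches per epoch (2400 in total) with a final batch of 4 samples, and the airplanes run sees 11 batches per epoch (4400 in total) with a final batch of 1 sample. This does not by itself explain the results, but very small trailing batches are worth checking.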

For reference, here's the code I'm using to generate these images. I've made a few changes to the dataloader as my test data contains objects that have been broken ahead of time. If necessary I can include all the code that I've altered. I haven't made any changes to how training is performed.

import trimesh

# Load data
model._load_full_test_set()

# Just extract the ones we want to reconstruct
reconstruct_list = [0]
fractured_voxels, _, labels = model.full_test_data
fractured_voxels = fractured_voxels[reconstruct_list, ...]
print(fractured_voxels.min(), fractured_voxels.max()) # >>> -1.0 1.0

# Predict for 2 iterations
predicted_voxels1 = model.predict(fractured_voxels, labels)
predicted_voxels2 = model.predict(predicted_voxels1, labels)
print(predicted_voxels2.min(), predicted_voxels2.max()) # >>> -0.99999666 1.0

# Convert to 0, 1 space
fractured_voxels = model.format_output(fractured_voxels).astype(float)
predicted_voxels2 = model.format_output(predicted_voxels2).astype(float)
print(predicted_voxels2.min(), predicted_voxels2.max()) # >>> 0.0 1.0

trimesh.voxel.VoxelGrid(
    fractured_voxels[0, ...],
).as_boxes().export("model_fractured.ply")
trimesh.voxel.VoxelGrid(
    predicted_voxels2[0, ...],
).as_boxes().export("model_complete.ply")
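To put a number on how much a prediction differs from its input, the occupancy grids can be compared directly instead of by eye. The following is a minimal sketch, assuming both arrays are already in 0/1 space as in the snippet above. `voxel_change_stats` is a hypothetical helper, and the toy 4x4x4 grids stand in for real model output.

```python
import numpy as np


def voxel_change_stats(input_voxels, predicted_voxels, threshold=0.5):
    """Count voxels added/removed by the prediction and compute IoU with the input."""
    a = np.asarray(input_voxels) > threshold
    b = np.asarray(predicted_voxels) > threshold
    added = int(np.count_nonzero(b & ~a))    # occupied in prediction only
    removed = int(np.count_nonzero(a & ~b))  # occupied in input only
    union = np.count_nonzero(a | b)
    # Two empty grids are treated as identical.
    iou = float(np.count_nonzero(a & b) / union) if union else 1.0
    return {"added": added, "removed": removed, "iou": iou}


# Toy example: the "prediction" fills one extra 4x4 slice of the grid.
toy_input = np.zeros((4, 4, 4))
toy_input[:2] = 1.0
toy_pred = toy_input.copy()
toy_pred[2] = 1.0

print(voxel_change_stats(toy_input, toy_pred))
```

An IoU near 1.0 with almost no added voxels would match the "just predicting the input" behavior seen for airplanes, while a large `removed` count would point to the network destroying existing geometry.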

Would you be able to post your original environment, or give any guidance on what you think may be causing these issues? I'd like to predict on a few other classes using your method.

Thanks very much,

@Tanmay-2106

Hi, could you please provide me with the ModelNet, 3DPotteryDataset, and Larco museum datasets? It would be a great help.

Thanks!

@nikwl nikwl changed the title Suspicously poor outputs Network seems to fail to converge on ShapeNet Jun 28, 2022