Issues with Training Gaussian Splatting on Higher Resolution Images #1146

Open
rab306 opened this issue Jan 29, 2025 · 3 comments
Comments


rab306 commented Jan 29, 2025

Hi,
I've been training the Gaussian Splatting model on drone images for the past few weeks, initially downsampling them to a resolution of (1024x683). Recently, I attempted to increase the resolution to (2048x1365), but I’m encountering an issue where the results for the higher resolution images are consistently worse than the ones from the lower resolution.
To avoid running out of memory, I’m loading the higher resolution images in batches, and I also tested this batch approach with the lower resolution images, where I got similar results to the original training script. So, I don’t believe there’s an issue with the batch loading process itself.
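For context, this is roughly the pattern I mean by loading in batches, simplified down to reading each view from disk only when it is needed (the paths and the loop body are placeholders, not my actual training code):

```python
# Simplified sketch: load each image lazily per training step instead of
# caching every full-resolution image on the GPU up front.
from pathlib import Path
import random

from PIL import Image
import torch
import torchvision.transforms.functional as TF

image_dir = Path("data/drone/images")          # placeholder path
image_paths = sorted(image_dir.glob("*.jpg"))

def load_image(path, device="cuda"):
    """Read one image from disk and move it to the GPU only when needed."""
    img = Image.open(path)
    return TF.to_tensor(img).to(device)        # shape [3, H, W], values in [0, 1]

for iteration in range(30_000):
    path = random.choice(image_paths)          # pick one view per iteration
    gt_image = load_image(path)
    # ... render from this camera, compute the loss against gt_image, backprop ...
    del gt_image                               # free VRAM before the next view
```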
I’ve tried various fine-tuning approaches, but I haven’t been able to identify any consistent pattern. For example, the default values gave me the worst results, while reducing some learning rates (such as the initial and final position learning rates) produced better results; however, reducing them further made the results worse again. Additionally, lowering the densification interval to 75 didn’t help either, and actually worsened the results compared to the default interval of 100.
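For concreteness, this is the kind of sweep I have been running. I believe the flag names (-r, --position_lr_init, --position_lr_final, --densification_interval) match the options train.py exposes, but the values below are just examples of what I tried, not recommendations:

```python
# Sketch of the parameter sweep: each run overrides a few train.py options.
import subprocess

runs = [
    {"pos_lr_init": 0.00016, "pos_lr_final": 0.0000016, "densify_interval": 100},  # defaults
    {"pos_lr_init": 0.00008, "pos_lr_final": 0.0000008, "densify_interval": 100},  # halved position LRs
    {"pos_lr_init": 0.00008, "pos_lr_final": 0.0000008, "densify_interval": 75},   # more frequent densification
]

for i, cfg in enumerate(runs):
    cmd = [
        "python", "train.py",
        "-s", "data/drone",            # placeholder source path
        "-m", f"output/run_{i}",
        "-r", "1",                     # train on the images as provided (no extra downscale)
        "--position_lr_init", str(cfg["pos_lr_init"]),
        "--position_lr_final", str(cfg["pos_lr_final"]),
        "--densification_interval", str(cfg["densify_interval"]),
    ]
    subprocess.run(cmd, check=True)
```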

Is there a specific approach I should follow when training on higher resolution images (anything above 1600x1000)? I’d really appreciate any advice or recommendations on how to improve the performance for high-res images.


jaco001 commented Jan 29, 2025

I have tried 2400 and 3200 px. 2400 gave me better results (slightly more fine detail); 3200 is a mixed bag, and beyond that the extra training time and VRAM give diminishing returns.
I wonder why you don't start at 1600 px, since you are already rescaling the images? The default parameters seem to be tuned best for that size.
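If it helps, rescaling to a 1600 px long side before convert.py is just something like this (a minimal sketch using PIL; the paths are placeholders):

```python
# Minimal sketch: downscale all source images so the long side is 1600 px
# before running convert.py / COLMAP. Paths are placeholders.
from pathlib import Path
from PIL import Image

src = Path("data/drone/images_full")
dst = Path("data/drone/input")
dst.mkdir(parents=True, exist_ok=True)

LONG_SIDE = 1600

for path in sorted(src.glob("*.jpg")):
    img = Image.open(path)
    scale = LONG_SIDE / max(img.size)
    if scale < 1.0:                      # only shrink, never upscale
        new_size = (round(img.width * scale), round(img.height * scale))
        img = img.resize(new_size, Image.LANCZOS)
    img.save(dst / path.name, quality=95)
```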


rab306 commented Jan 29, 2025

I will experiment with 1600 px next time, thanks for pointing that out. What aspect ratio did you use for your data, and does it have any effect?
Also, what fine-tuning did you do for 2400 px?


jaco001 commented Jan 29, 2025

  1. Aspect ratio: 4:3, 16:9, etc., but it doesn't have much impact (wider framing is better for tracking in convert.py), so I only specify the long side of the image. COLMAP will still give you something like 1608 px because it removes the lens distortion. You can tinker with a few parameters there to keep the exact size, but it takes time and extra effort.
  2. For 2400 px I only changed the image size and passed -r 1 to training. My datasets range from 50-500 images, so I don't need to change the learning rates or the densification settings.
    If your drone footage has 1000+ images, work out how often each image is actually seen by train.py; with around 100 images the ratio is roughly 300 iterations per image, and even around 100 can still be decent (see the sketch after this list). By default, densification runs for about half of the iterations; here I would add more steps, e.g. for a 100k-step run keep densification going until around 75k.
    Another route is to feed the trained model back into training (as if it were the input sparse model). This can also improve quality, at the only cost of extra training time.
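A rough sketch of that ratio calculation and of extending densification for a long run. The numbers are only examples, and I am assuming the standard train.py options (--iterations, --densify_until_iter); check arguments/__init__.py in your version to be sure.

```python
# Back-of-the-envelope: how many times is each image seen during training,
# and where should densification stop for a long run? Example numbers only.
num_images = 1200            # e.g. a large drone capture
iterations = 100_000         # much longer than the 30k default

passes_per_image = iterations / num_images
print(f"each image is seen ~{passes_per_image:.0f} times")
# ~83 here; ~300 for 100 images at the default 30k iterations

# Densification normally stops around the halfway point; for a 100k run
# I would push it to ~75k via the corresponding train.py flag:
densify_until = int(iterations * 0.75)
print(f"python train.py -s <data> -r 1 --iterations {iterations} "
      f"--densify_until_iter {densify_until}")
```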
