glad you're having success with it! i'm not sure how you managed to get MSW-MSA attention speeding up SDXL though, since there didn't seem to be anything special in your workflow related to that. it might just be that downscaling is its own acceleration, so any performance increase you're seeing may come from the shrink effects rather than MSW-MSA.
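as a rough back-of-envelope illustration (not anything from the repo): self-attention cost is quadratic in token count, so downscaling by itself buys a big chunk of speed in the blocks it touches:

```python
def attn_cost_ratio(downscale):
    """relative self-attention cost after downscaling both spatial dims.

    tokens scale with area (1 / downscale**2), attention with tokens**2.
    """
    tokens_ratio = (1.0 / downscale) ** 2
    return tokens_ratio**2


print(attn_cost_ratio(2))  # 0.0625 - roughly 16x fewer attention FLOPs in that block
```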
interesting. i actually came up with my own version of that independently: https://github.com/blepping/ComfyUI-bleh#blehdeepshrink. it works a bit differently: it scales the downscale factor between sampling percentages. i haven't used the node recently though, i think the HiDiffusion approach works better in general.
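for anyone curious, here's a minimal sketch of what "scaling the downscale between sampling percentages" could look like - illustrative only, not the actual BlehDeepShrink code. the idea is the factor ramps from its full value back toward 1.0 across the sampling window:

```python
import torch.nn.functional as F


def ramped_factor(percent, start_percent, end_percent, max_factor):
    # full shrink before the window, no shrink after it
    if percent <= start_percent:
        return max_factor
    if percent >= end_percent:
        return 1.0
    # linear ramp from max_factor back to 1.0 across the window
    t = (percent - start_percent) / (end_percent - start_percent)
    return max_factor + t * (1.0 - max_factor)


def shrink(h, percent, max_factor=2.0):
    # h is an NCHW feature map; the 0.35 end percent and bicubic mode
    # are just placeholder choices for the example
    factor = ramped_factor(percent, 0.0, 0.35, max_factor)
    if factor == 1.0:
        return h
    return F.interpolate(h, scale_factor=1.0 / factor, mode="bicubic")
```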
if you have the bleh nodes i linked above you'll automatically get a bunch more upscale options
not easily possible in this case unfortunately. the main downscale method uses conv dilation, which i believe only works in powers of 2, so you could use 2x, 4x, 8x, 16x or whatever, but you can't do 3x, 5x or stuff in between like normal interpolation. of course the upscaling part just has to return the tensor to its original size, so you don't have flexibility there either.

what you could possibly do, instead of enabling the "CA" stuff in the RAUNet node, is target those layers with Deep Shrink (or the "gradient" version). the CA bit basically isn't doing anything different from Deep Shrink - the only difference is it uses average pooling to downscale (there's a rough sketch of what that could look like below). i actually could add that to bleh so you could choose that option in other stuff if you wanted.

by the way, i see you're using SamplerSupreme. you might also be interested in this project: https://github.com/blepping/comfyui_overly_complicated_sampling - basically ripping off SamplerSupreme to let you add substeps with nodes (so you can configure the substeps in any order - if you want euler, bogacki, reversible heun, euler as your substeps, that's possible). it also has a bunch of different methods to combine them, including some that massively accelerate the process and can make using 50+ substeps practical. very, very experimental and rough currently.
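for reference, here's a rough sketch of targeting a block with an average-pooling downscale, modeled on the pattern ComfyUI's built-in Deep Shrink node uses to hook input/output blocks. the block index and factor are placeholders, not the actual RAUNet/CA code:

```python
import torch.nn.functional as F

import comfy.utils


def add_avg_pool_shrink(model, block_number=3, factor=2):
    # clone so the original model patcher isn't modified
    m = model.clone()

    def input_block_patch(h, transformer_options):
        # only shrink the targeted input block
        if transformer_options["block"][1] == block_number:
            # average pooling, like the CA downscale - kernel/stride must be
            # integers, so fractional factors don't fit here either
            h = F.avg_pool2d(h, kernel_size=factor, stride=factor)
        return h

    def output_block_patch(h, hsp, transformer_options):
        # the skip connection (hsp) kept the original size, so the upscale
        # side just has to restore h to match it
        if h.shape[-2:] != hsp.shape[-2:]:
            h = comfy.utils.common_upscale(
                h, hsp.shape[-1], hsp.shape[-2], "bicubic", "disabled"
            )
        return h, hsp

    m.set_model_input_block_patch(input_block_patch)
    m.set_model_output_block_patch(output_block_patch)
    return m
```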
-
I'm not sure what settings were off in your tests. I use SDXL almost exclusively. I've been using the Kohya Deep Shrink node, but it causes some problems with hands and other things due to artifacts in the latent, and then it becomes a balancing act between letting it generate far enough to get good definition and giving it enough time to fix the doubling from the latent.
Apparently (or so I have heard) Kohya Deep Shrink is some form of ScaleCrafter implementation. I also tried the gradient version, which seemed to help a little, but not completely, though I have not tried tweaking it much yet.
Then I found the HiDiffusion repo and your code, and tried that. It works fantastically on SDXL, and I am not seeing any of the artifacting issues you were having.
(Sorry in advance if you drag the attached workflow into Comfy; my workflow is very, very messy!)
In addition, using your node with SDXL decreases my gen time by roughly 35%, which is fantastic. There are still some latent doubling artifacts, but they seem reduced, and I still need to tweak the settings more.
All this to say: thank you for doing this! I understand how hacky it is (I looked through the code), but I can confirm that the results are pretty great!
It would be interesting to see whether some form of gradient, like the Deep Shrink modification above, would help with the latent doubling, but I do not know enough about how SD works yet to figure that out.