glad you're having success with it! i'm not sure how you managed to get MSW-MSA attention speeding up SDXL though, since there didn't seem to be anything special in your workflow related to that. it might just be that downscaling is its own acceleration, so any performance increase you're seeing may come from the shrink effects rather than MSW-MSA.
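as a rough back-of-envelope illustration (not anything from the repo): self-attention cost is quadratic in token count, so downscaling by itself buys a big chunk of speed in the blocks it touches:

```python
def attn_cost_ratio(downscale):
    """relative self-attention cost after downscaling both spatial dims.

    tokens scale with area (1 / downscale**2), attention with tokens**2.
    """
    tokens_ratio = (1.0 / downscale) ** 2
    return tokens_ratio**2


print(attn_cost_ratio(2))  # 0.0625 - roughly 16x fewer attention FLOPs in that block
```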
interesting. i actually came up with my own version of that independently: https://github.com/blepping/ComfyUI-bleh#blehdeepshrink. it works a bit differently: it scales the downscale factor between sampling percentages. i haven't used the node recently though, i think the HiDiffusion approach works better in general.
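for anyone curious, here's a minimal sketch of what "scaling the downscale between sampling percentages" could look like - illustrative only, not the actual BlehDeepShrink code. the idea is the factor ramps from its full value back toward 1.0 across the sampling window:

```python
import torch.nn.functional as F


def ramped_factor(percent, start_percent, end_percent, max_factor):
    # full shrink before the window, no shrink after it
    if percent <= start_percent:
        return max_factor
    if percent >= end_percent:
        return 1.0
    # linear ramp from max_factor back to 1.0 across the window
    t = (percent - start_percent) / (end_percent - start_percent)
    return max_factor + t * (1.0 - max_factor)


def shrink(h, percent, max_factor=2.0):
    # h is an NCHW feature map; the 0.35 end percent and bicubic mode
    # are just placeholder choices for the example
    factor = ramped_factor(percent, 0.0, 0.35, max_factor)
    if factor == 1.0:
        return h
    return F.interpolate(h, scale_factor=1.0 / factor, mode="bicubic")
```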
if you have the bleh nodes i linked above you'll automatically get a bunch more upscale options
not easily possible in this case unfortunately. the main downscale method uses conv dilation, which i believe only works in powers of 2, so you could use 2x, 4x, 8x, 16x or whatever, but you can't do 3x, 5x or stuff in between like normal interpolation. of course the upscaling part just has to return the tensor to its original size, so you don't have flexibility there either.

what you could possibly do, instead of enabling the "CA" stuff in the RAUNet node, is target those layers with Deep Shrink (or the "gradient" version). the CA bit basically isn't doing anything different from Deep Shrink - the only difference is it uses average pooling to downscale (there's a rough sketch of what that could look like below). i actually could add that to bleh so you could choose that option in other stuff if you wanted.

by the way, i see you're using SamplerSupreme. you might also be interested in this project: https://github.com/blepping/comfyui_overly_complicated_sampling - basically ripping off SamplerSupreme to let you add substeps with nodes (so you can configure the substeps in any order - if you want euler, bogacki, reversible heun, euler as your substeps, that's possible). it also has a bunch of different methods to combine them, including some that massively accelerate the process and can make using 50+ substeps practical. very, very experimental and rough currently.
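for reference, here's a rough sketch of targeting a block with an average-pooling downscale, modeled on the pattern ComfyUI's built-in Deep Shrink node uses to hook input/output blocks. the block index and factor are placeholders, not the actual RAUNet/CA code:

```python
import torch.nn.functional as F

import comfy.utils


def add_avg_pool_shrink(model, block_number=3, factor=2):
    # clone so the original model patcher isn't modified
    m = model.clone()

    def input_block_patch(h, transformer_options):
        # only shrink the targeted input block
        if transformer_options["block"][1] == block_number:
            # average pooling, like the CA downscale - kernel/stride must be
            # integers, so fractional factors don't fit here either
            h = F.avg_pool2d(h, kernel_size=factor, stride=factor)
        return h

    def output_block_patch(h, hsp, transformer_options):
        # the skip connection (hsp) kept the original size, so the upscale
        # side just has to restore h to match it
        if h.shape[-2:] != hsp.shape[-2:]:
            h = comfy.utils.common_upscale(
                h, hsp.shape[-1], hsp.shape[-2], "bicubic", "disabled"
            )
        return h, hsp

    m.set_model_input_block_patch(input_block_patch)
    m.set_model_output_block_patch(output_block_patch)
    return m
```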
-
I'm not sure what settings were off in your tests. I use SDXL almost exclusively. I've been using the Kohya Deep Shrink node, but it causes some problems with hands and other things due to artifacts in the latent, and then it becomes a balancing act between letting it generate far enough to get good definition and giving it enough time to fix the doubling from the latent.
Apparently (or so I have heard) Kohya Deep Shrink is some form of ScaleCrafter implementation. I also tried the gradient version, which seemed to help a little, but not completely, though I have not tried tweaking it much yet.
Then I found the HiDiffusion repo and your code, and tried that. It works fantastically on SDXL, and I am not seeing any of the artifacting issues you were having.
(Sorry in advance if you drag the attached workflow into Comfy; my workflow is very, very messy!)
In addition, using your node with SDXL decreases my gen time by roughly 35%, which is fantastic. There are still some latent doubling artifacts, but they seem reduced, and I still need to tweak the settings more.
All this to say: thank you for doing this! I understand how hacky it is (I looked through the code), but I can confirm that the results are pretty great!
It would be interesting to see whether some form of gradient, like the Deep Shrink modification above, would help with the latent doubling, but I do not know enough about how SD works yet to figure that out.