
v1.2.1 - free lunch edition

@bghira bghira released this 03 Dec 23:17

Features

This release will speed up all validations without any config changes.

  • SageAttention (NVIDIA-only; must be installed manually for now)
    • By default, it only speeds up inference; SDXL benefits more than Flux due to differences in their respective bottlenecks.
    • Use --attention_mechanism=sageattention to enable this, and --sageattention_usage=training+inference to apply it during training as well as validations, though this will likely degrade your model or cause it to collapse (see the sketch after this list).
  • Optimised --gradient_checkpointing implementation
    • Gradient checkpointing no longer applies during validations, so even without SageAttention a Flux validation image drops from 29 seconds to 15 seconds on a 4090 + 5800X3D (SDXL drops from 15 seconds to 6 seconds).
  • Added --gradient_checkpointing_interval, which you can use to speed up Flux training at the cost of some additional VRAM.
    • This makes NF4 even more attractive on a 4090, where you can then use the SOAP optimiser in a meaningful way.
    • See the options guide for more information.
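Purely as illustration, a minimal sketch of how the new flags might be combined on a command line. The train.sh entrypoint and the interval value of 4 are placeholders rather than part of this release; only the flag names and values come from the notes above.

```bash
# Hypothetical invocation; fold these into your existing config/command.
./train.sh \
  --attention_mechanism=sageattention \
  --gradient_checkpointing_interval=4   # checkpoint blocks less often: faster, but uses more VRAM

# Riskier: also apply SageAttention during training, not just validations.
#   --sageattention_usage=training+inference
```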

What's Changed

Full Changelog: v1.2...v1.2.1