Nunchaku SANA Models

Text-to-Image Gradio Demo

python run_gradio.py
  • By default, the Gemma-2B model is loaded as a safety checker. To disable this feature and save GPU memory, use --no-safety-checker.
  • By default, only the INT4 DiT is loaded. Use -p int4 bf16 to add a BF16 DiT for side-by-side comparison, or -p bf16 to load only the BF16 model.
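
For example, to load both the INT4 and BF16 DiTs for side-by-side comparison with the safety checker disabled (assuming the two flags above can be combined), you would run:

python run_gradio.py -p int4 bf16 --no-safety-checker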

Command Line Inference

We provide a script, generate.py, that generates an image from a text prompt directly from the command line, similar to the demo. Simply run:

python generate.py --prompt "You Text Prompt"
  • The generated image will be saved as output.png by default. You can specify a different path with the -o or --output-path option.
  • By default, the script uses our INT4 model. To use the BF16 model instead, specify -p bf16.
  • You can adjust the number of inference steps and classifier-free guidance scale with -t and -g, respectively. The defaults are 20 steps and a guidance scale of 5.
  • In addition to classifier-free guidance, you can adjust the PAG guidance scale with --pag-scale. The default is 2.
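
For example, the following combines the options above (the prompt text is only a placeholder) to generate an image with the BF16 model using 30 steps, a guidance scale of 4.5, and a PAG scale of 1.5, saving the result to sana.png:

python generate.py --prompt "A cat holding a sign that reads Hello" -p bf16 -t 30 -g 4.5 --pag-scale 1.5 -o sana.png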

Latency Benchmark

To measure the latency of our INT4 models, use the following command:

python latency.py
  • Adjust the number of inference steps and the guidance scale using -t and -g, respectively. The defaults are 20 steps and a guidance scale of 5.
  • You can also adjust the PAG guidance scale with --pag-scale. The default is 2.
  • By default, the script measures the end-to-end latency for generating a single image. To measure the latency of a single DiT forward step instead, use the --mode step flag.
  • Specify the number of warmup and test runs using --warmup-times and --test-times. The defaults are 2 warmup runs and 10 test runs.
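
For example, to time a single DiT forward step with 5 warmup runs and 20 test runs (combining the flags described above), run:

python latency.py --mode step --warmup-times 5 --test-times 20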