👉 PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models

Runze He, Bo Cheng, Yuhang Ma, Qingxiang Jia, Shanyuan Liu, Ao Ma, Xiaoyu Wu, Liebucha Wu, Dawei Leng†, Yuhui Yin († Corresponding Author)


🔥 News

  • [2025/3/14] We created this GitHub repository and released the code.
  • [2025/3/14] We released the PlanGen paper.

🔧 Quick Start

1. Setup repository and environment

git clone https://github.com/360CVGroup/PlanGen.git
cd PlanGen

conda create -n plangen python=3.10
conda activate plangen
pip install -r requirements.txt

2. Prepare the models

git lfs install

# PlanGen checkpoint
git clone https://huggingface.co/qihoo360/PlanGen models/PlanGen
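If Git LFS was not active during the clone, the large files in models/PlanGen are left as small text pointer stubs that begin with `version https://git-lfs.github.com/spec/v1`. The following is a minimal sketch for spotting such stubs. The helper name `find_lfs_pointers` and the 1024-byte size cutoff are assumptions made for illustration and are not part of the PlanGen code.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(root):
    """Return files under `root` that still look like Git LFS pointer stubs."""
    root = Path(root)
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the repository's .git folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        # Assumed cutoff: pointer stubs are tiny, real weights are much larger.
        if path.stat().st_size > 1024:
            continue
        with path.open("rb") as fh:
            if fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                pointers.append(path)
    return pointers


if __name__ == "__main__":
    stubs = find_lfs_pointers("models/PlanGen")
    if stubs:
        print("Files not yet fetched by Git LFS:")
        for stub in stubs:
            print(f"  {stub}")
    else:
        print("No LFS pointer stubs found.")
```

If any stubs are listed, running `git lfs pull` inside models/PlanGen should fetch the real files.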

3. Prepare the training data

Please refer to CreatiLayout to download the LayoutSAM dataset.

Please refer to HiCo to prepare the HiCo dataset.

Please refer to OpenImage v6 to download the OpenImage dataset, and use MiniCPM to caption the images. The OpenImage captions we annotated are available in PlanGen_oim_caps.

4. Multi-task Inference on LayoutSAM-eval dataset

Set layoutsam_eval in project/plangen/cfg/base.py to the directory where the LayoutSAM-eval dataset was downloaded. If your machine has Internet access, the dataset is downloaded automatically.

# layout2image generation
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py --opt test=True resume=models/PlanGen/checkpoint-200000 test_data.data_name='creati' test_data.task_type='uni'

# layout-image joint generation
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py --opt test=True resume=models/PlanGen/checkpoint-200000 test_data.data_name='creati' test_data.task_type='uni_2stage'

# image layout understanding
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py --opt test=True resume=models/PlanGen/checkpoint-200000 test_data.data_name='creati' test_data.task_type='mmu'
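The three commands above differ only in test_data.task_type, so they can be generated from one template. The sketch below builds each command as an argument list, using only the flags shown above. The names `TASKS` and `build_command` are hypothetical helpers, not part of the repository. The shell quotes around 'creati' and 'uni' are omitted because the shell removes them before train.py sees the arguments.

```python
import shlex

CFG = "project/plangen/cfg/uni/h_text_ump+oimsam.py"
CKPT = "models/PlanGen/checkpoint-200000"

# Human-readable task label -> value passed as test_data.task_type.
TASKS = {
    "layout2image generation": "uni",
    "layout-image joint generation": "uni_2stage",
    "image layout understanding": "mmu",
}


def build_command(task_type, data_name="creati"):
    """Return the argv list for one inference run on the given task type."""
    return [
        "python", "train.py",
        "--cfg", CFG,
        "--opt",
        "test=True",
        f"resume={CKPT}",
        f"test_data.data_name={data_name}",
        f"test_data.task_type={task_type}",
    ]


if __name__ == "__main__":
    # Only print the commands; pass a list to subprocess.run to execute one.
    for label, task_type in TASKS.items():
        print(f"# {label}")
        print(shlex.join(build_command(task_type)))
```

Printing rather than executing keeps the sketch safe to run anywhere; from the repository root, each list could be passed to `subprocess.run(cmd, check=True)` to launch the runs one after another.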

5. Object Removal and Image Editing on a custom COCO subset

To perform object removal and image editing, download our 200 preprocessed COCO-based samples from PlanGen_coco_data.

wget https://huggingface.co/datasets/qihoo360/PlanGen_data/resolve/main/coco_data.zip
unzip coco_data.zip

Set coco_200_path in project/plangen/cfg/base.py to ./coco_data.
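If the download was interrupted or returned a web page instead of the archive, unzip fails or produces an incomplete coco_data directory. As a quick sanity check, the sketch below verifies the archive with Python's standard zipfile module. The function name `check_archive` and its return strings are illustrative assumptions.

```python
import zipfile
from pathlib import Path


def check_archive(zip_path):
    """Return 'ok' if zip_path is a readable zip archive, else a short reason."""
    path = Path(zip_path)
    if not path.is_file():
        return "missing"
    # An HTML page or a truncated download does not pass this check.
    if not zipfile.is_zipfile(path):
        return "not a zip archive"
    with zipfile.ZipFile(path) as zf:
        # testzip() returns the name of the first corrupt member, or None.
        bad_member = zf.testzip()
    return "ok" if bad_member is None else f"corrupt member: {bad_member}"


if __name__ == "__main__":
    print(check_archive("coco_data.zip"))
```

Any result other than "ok" suggests downloading the file again before unzipping.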

# object removal
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py --opt test=True resume=models/PlanGen/checkpoint-200000 test_data.data_name='coco' use_teacher_forcing=True pad_edit_box=0.1 use_neg_box=True trans_data_to_rm=True ## rm

# layout-guided image editing
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py --opt test=True resume=models/PlanGen/checkpoint-200000 test_data.data_name='edit_coco' use_teacher_forcing=True pad_edit_box=0.1 use_neg_box=False ## edit

🔥 Train

python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py

The default training data includes LayoutSAM, HiCo, OpenImage and LayoutGPT; you can change this in the configuration file as needed.

To use LayoutGPT data in training, clone the LayoutGPT repository into three_party:

cd three_party
git clone https://github.com/weixi-feng/LayoutGPT
cd ..

BibTeX

@misc{he2025plangen,
      title={PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models}, 
      author={Runze He and Bo Cheng and Yuhang Ma and Qingxiang Jia and Shanyuan Liu and Ao Ma and Xiaoyu Wu and Liebucha Wu and Dawei Leng and Yuhui Yin},
      year={2025},
      eprint={2503.10127},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.10127}, 
}

License

This project is licensed under the Apache License (Version 2.0).
