👉 PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models
Runze He, Bo Cheng, Yuhang Ma, Qingxiang Jia, Shanyuan Liu, Ao Ma, Xiaoyu Wu, Liebucha Wu, Dawei Leng†, Yuhui Yin (†Corresponding Author)
- [2025/3/14] We initialized this GitHub repository and released the code.
- [2025/3/14] We released the PlanGen paper.
git clone https://github.com/360CVGroup/PlanGen.git
cd PlanGen
conda create -n plangen python=3.10
conda activate plangen
pip install -r requirements.txt
git lfs install
# PlanGen checkpoint
git clone https://huggingface.co/qihoo360/PlanGen models/PlanGen
Please refer to CreatiLayout to download the LayoutSAM dataset.
Please refer to HiCo to prepare the HiCo dataset.
Please refer to OpenImage v6 to download the OpenImage dataset, and use MiniCPM to caption the images. The OpenImage captions we annotated are stored in PlanGen_oim_caps.
In the file project/plangen/cfg/base.py, change layoutsam_eval to the directory where it was downloaded. If your machine is connected to the Internet, it will be downloaded automatically.
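Before editing the config, it can help to see where the setting is mentioned. The sketch below is illustrative only: `find_setting` is a hypothetical helper, not part of PlanGen, and it does a plain substring search, so it assumes nothing about how base.py is structured.

```python
from pathlib import Path


def find_setting(cfg_path, name):
    """Return (line_number, text) pairs for lines in cfg_path that mention name.

    Hypothetical helper, not part of PlanGen. It is a plain substring search
    and makes no assumption about the layout of the config file.
    """
    lines = Path(cfg_path).read_text(encoding="utf-8").splitlines()
    return [(i, line.strip()) for i, line in enumerate(lines, start=1) if name in line]


if __name__ == "__main__":
    # Path and setting name are taken from the instructions above.
    cfg = Path("project/plangen/cfg/base.py")
    if cfg.exists():
        for lineno, text in find_setting(cfg, "layoutsam_eval"):
            print(f"{cfg}:{lineno}: {text}")
    else:
        print(f"{cfg} not found; run this from the PlanGen repository root")
```

The same helper can be pointed at any other setting name in that file.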
# layout2image generation
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py --opt test=True resume=models/PlanGen/checkpoint-200000 test_data.data_name='creati' test_data.task_type='uni'
# layout-image joint generation
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py --opt test=True resume=models/PlanGen/checkpoint-200000 test_data.data_name='creati' test_data.task_type='uni_2stage'
# image layout understanding
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py --opt test=True resume=models/PlanGen/checkpoint-200000 test_data.data_name='creati' test_data.task_type='mmu'
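The three commands above differ only in `test_data.task_type`. As a convenience, they could be generated from one place. This is a minimal sketch: `build_eval_command` and `TASK_TYPES` are hypothetical names, and every option value is copied from the commands above. The shell quotes around `'creati'` and the task type are dropped because the shell removes them before train.py sees the arguments.

```python
import shlex

# Values copied from the commands above; only task_type changes between tasks.
CFG = "project/plangen/cfg/uni/h_text_ump+oimsam.py"
CHECKPOINT = "models/PlanGen/checkpoint-200000"
TASK_TYPES = {
    "layout2image generation": "uni",
    "layout-image joint generation": "uni_2stage",
    "image layout understanding": "mmu",
}


def build_eval_command(task_type):
    """Build the argument list for one inference run (hypothetical helper)."""
    return [
        "python", "train.py",
        "--cfg", CFG,
        "--opt",
        "test=True",
        f"resume={CHECKPOINT}",
        "test_data.data_name=creati",
        f"test_data.task_type={task_type}",
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing heavy starts by accident.
    for label, task_type in TASK_TYPES.items():
        print(f"# {label}")
        print(shlex.join(build_eval_command(task_type)))
```

To actually launch a run from the repository root, the list returned by `build_eval_command` can be passed to `subprocess.run(cmd, check=True)`.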
To perform object removal and image editing, download our 200 preprocessed COCO-based samples from PlanGen_coco_data.
wget https://huggingface.co/datasets/qihoo360/PlanGen_data/resolve/main/coco_data.zip
unzip coco_data.zip
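A download can silently save something other than the archive, for example an HTML page, in which case `unzip` fails. The sketch below checks the file with Python's standard `zipfile` module; `check_archive` is a hypothetical helper, not part of PlanGen.

```python
import zipfile
from pathlib import Path


def check_archive(path):
    """Return the number of entries in a zip archive, or raise if it is not one.

    Hypothetical helper: a download that saved an HTML page instead of the
    archive fails the zipfile.is_zipfile check.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"{path} does not exist")
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a zip archive; re-download it")
    with zipfile.ZipFile(path) as archive:
        return len(archive.namelist())


if __name__ == "__main__":
    try:
        count = check_archive("coco_data.zip")
        print(f"coco_data.zip looks valid: {count} entries")
    except (FileNotFoundError, ValueError) as err:
        print(err)
```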
In the file project/plangen/cfg/base.py, change coco_200_path to ./coco_data.
# object removal
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py --opt test=True resume=models/PlanGen/checkpoint-200000 test_data.data_name='coco' use_teacher_forcing=True pad_edit_box=0.1 use_neg_box=True trans_data_to_rm=True ## rm
# layout-guided image editing
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py --opt test=True resume=models/PlanGen/checkpoint-200000 test_data.data_name='edit_coco' use_teacher_forcing=True pad_edit_box=0.1 use_neg_box=False ## edit
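The two editing commands above share most options, so the differences are easy to miss. The sketch below lists them side by side. `EDIT_PRESETS` and `differing_options` are hypothetical names, the option values are copied from the commands above, and no claim is made here about what each option does internally.

```python
# Options copied from the two commands above; --cfg, test=True and resume are shared.
EDIT_PRESETS = {
    "object removal": {
        "test_data.data_name": "coco",
        "use_teacher_forcing": "True",
        "pad_edit_box": "0.1",
        "use_neg_box": "True",
        "trans_data_to_rm": "True",
    },
    "layout-guided image editing": {
        "test_data.data_name": "edit_coco",
        "use_teacher_forcing": "True",
        "pad_edit_box": "0.1",
        "use_neg_box": "False",
    },
}


def differing_options(a, b):
    """Return {option: (value_in_a, value_in_b)} for options whose values differ.

    None marks an option that is absent from that command.
    """
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    diff = differing_options(
        EDIT_PRESETS["object removal"],
        EDIT_PRESETS["layout-guided image editing"],
    )
    for option, (rm_value, edit_value) in diff.items():
        print(f"{option}: removal={rm_value}, editing={edit_value}")
```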
# training
python train.py --cfg project/plangen/cfg/uni/h_text_ump+oimsam.py
The default training data includes LayoutSAM, HiCo, OpenImage and LayoutGPT, which you can modify in the configuration file as needed.
If you need to use LayoutGPT data in your training, do the following:
cd three_party
git clone https://github.com/weixi-feng/LayoutGPT
cd ..
@misc{he2025plangen,
title={PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models},
author={Runze He and Bo Cheng and Yuhang Ma and Qingxiang Jia and Shanyuan Liu and Ao Ma and Xiaoyu Wu and Liebucha Wu and Dawei Leng and Yuhui Yin},
year={2025},
eprint={2503.10127},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.10127},
}
This project is licensed under the Apache License (Version 2.0).