Scaling Safe Multi-Agent Control for Signal Temporal Logic Specifications [CoRL 2024]


Official Implementation of Paper: Joe Eappen, Zikang Xiong, Dipam Patel, Aniket Bera, Suresh Jagannathan: "Scaling Safe Multi-Agent Control for Signal Temporal Logic Specifications".

Main File Structure

├── requirements.txt                    # requirements file installed via pip
├──                            # training script
├──                             # testing script
├── plot.ipynb                          # plotting helper notebook
├── *.sh                                # bash scripts for running experiments
├── tests                               # (dir) simple testing scripts to load environment
├── pretrained                          # (dir) pre-trained models (saved GCBF+ model used in the paper)
└── gcbfplus                            # (dir) GCBF+ code from MIT-REALM
    ├── algo                            # (dir) GCBF+ and GCBF code
    │   ├── module                      # (dir) High-level network modules
    │   │   ├── planner                 # (dir) Planner modules    
    │   │   └── ...  
    │   ├──           # Planner with GCBF+ controller algorithm
    │   └── ...                         
    ├── env                             # (dir) Environment code
    │   ├── wrapper                     # (dir) Environment wrappers with STL interface
    │   │   ├──              # Environment wrapper with STL interface
    │   │   ├──           # Asynchronous goal wrapper (for asynchronous goal change during deployment)
    │   │   └── ...
    │   └── ...
    ├── nn                              # (dir) Core Neural network modules  
    ├── stl                             # (dir) Signal Temporal Logic (STL) utilities
    ├── trainer                         # (dir) Training utilities
    └── utils                           # (dir) Utility functions
        ├── configs/default_config.yaml # Default configuration file
        └── ...


We recommend using conda to install the requirements:

conda create -n mastl-gcbf python=3.10
conda activate mastl-gcbf
cd mastl-gcbf

Then install JAX following the official instructions, followed by the CPU version of PyTorch (this satisfies the diff-spec package requirements without disturbing the JAX installation):

pip3 install torch --index-url

Next, install the rest of the dependencies:

pip install -r requirements.txt


Install the package by running:

pip install -e .



We use the 2D environments from the original GCBF+ paper [1]: SingleIntegrator, DoubleIntegrator, and DubinsCar.


We provide three planners: the STLPY MILP planner [2] (stlpy), the GNN-ODE planner (gnn-ode), and the ODE planner (ode), which omits the GNN component. Use --planner to specify the planner.

STL Specifications

Experiment with different STL specifications by changing the --spec flag. We provide the following STL specifications:

  • coverN: Cover N regions, e.g., cover3 covers 3 regions
  • seqN: Sequence of N regions, e.g., seq3 sequentially visits 3 regions
  • MbranchN: M-branch with N regions, e.g., 2branch3 has 2 branches with 3 regions each
  • MloopN: Loop M times over N regions, e.g., 2loop3 loops 2 times over 3 regions
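
The naming scheme above can be decoded mechanically. The following is a minimal sketch of such a decoder; `SpecName` and `parse_spec_name` are hypothetical helpers written for illustration and are not part of this repository.

```python
import re
from typing import NamedTuple, Optional


class SpecName(NamedTuple):
    """Decoded --spec value (hypothetical helper, not part of the repo)."""
    kind: str                # "cover", "seq", "branch", or "loop"
    repeats: Optional[int]   # the M prefix; only used by branch and loop
    regions: int             # the N suffix


# Optional leading integer M, a spec family, then the region count N.
_SPEC_RE = re.compile(r"^(\d+)?(cover|seq|branch|loop)(\d+)$")


def parse_spec_name(name: str) -> SpecName:
    """Split a name such as 'cover3' or '2branch3' into its parts."""
    match = _SPEC_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized spec name: {name!r}")
    m, kind, n = match.groups()
    needs_prefix = kind in ("branch", "loop")
    if needs_prefix != (m is not None):
        raise ValueError(f"spec {name!r} has a missing or unexpected M prefix")
    return SpecName(kind, int(m) if m is not None else None, int(n))


if __name__ == "__main__":
    for example in ("cover3", "seq3", "2branch3", "2loop3"):
        print(example, "->", parse_spec_name(example))
```

Names that do not follow the pattern (for example `branch3` without the M prefix) raise a `ValueError` in this sketch.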


For the STLPY MILP [2] controller, use the vanilla GCBF+ controller (gcbf+) which does not need to be trained, and for the GNN-ODE planner, use the pretrained GCBF+ controller with the learnable planner (plangcbf+). Use --algo to specify the controller.
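
The planner and controller pairings described above can be written down as a small lookup. This is an illustrative sketch only: `PLANNER_TO_ALGO` and `algo_for_planner` are hypothetical names, and the scripts in this repository may not validate the combination this way.

```python
# Pairings taken from this README: the MILP planner runs with the vanilla
# GCBF+ controller, while the learnable planners use plangcbf+.
PLANNER_TO_ALGO = {
    "stlpy": "gcbf+",
    "gnn-ode": "plangcbf+",
    "ode": "plangcbf+",
}


def algo_for_planner(planner: str) -> str:
    """Return the --algo value this README pairs with a --planner value."""
    try:
        return PLANNER_TO_ALGO[planner]
    except KeyError:
        known = ", ".join(sorted(PLANNER_TO_ALGO))
        raise ValueError(f"unknown planner {planner!r}; expected one of: {known}")


if __name__ == "__main__":
    for planner in PLANNER_TO_ALGO:
        print(f"--planner {planner} --algo {algo_for_planner(planner)}")
```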


To reproduce the results shown in our paper, refer to settings.yaml.


To train the planner (the plangcbf+ setting with a GNN-ODE or ODE planner) on top of the pretrained GCBF+ controller, use:

python --algo plangcbf+ --env DubinsCar -n 8 --area-size 4 --n-env-train 8 --n-env-test 8  --load-dir ./pretrained/DubinsCar/gcbf+/models/ --spec cover3 --spec-len 15 --lr-planner 1e-5 --planner gnn-ode --goal-sample-interval 30 --loss-real-stl-coef 0.5 --loss-plan-stl-coef 0.5 --steps 2500 --loss-achievable-coef 10 

In our paper, we use 8 agents with 1000 training steps. The training logs are saved in the folder ./logs/<env>/<algo>/seed<seed>_<training-start-time>. We also provide the following flags:

  • -n: number of agents
  • --env: environment, including SingleIntegrator, DoubleIntegrator, DubinsCar, LinearDrone, and CrazyFlie
  • --algo: algorithm, including gcbf, gcbf+, and plangcbf+
  • --seed: random seed
  • --steps: number of training steps
  • --name: name of the experiment
  • --debug: debug mode: no recording, no saving, and no JIT
  • --obs: number of obstacles
  • --n-rays: number of LiDAR rays
  • --area-size: side length of the environment
  • --n-env-train: number of environments for training
  • --n-env-test: number of environments for testing
  • --log-dir: path to save the training logs
  • --eval-interval: interval of evaluation
  • --eval-epi: number of episodes for evaluation
  • --save-interval: interval of saving the model
  • --goal-sample-interval: interval of sampling new goals
  • --spec-len: length of the STL specification (number of waypoints from the planner)
  • --spec: STL specification
  • --lr-planner: learning rate of the planner
  • --planner: planner, including gnn-ode and ode
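
To locate a run afterwards, it helps to know how the log folder name is assembled. The sketch below rebuilds the ./logs/<env>/<algo>/seed<seed>_<training-start-time> pattern. The helper `run_log_dir` is hypothetical, and the `%Y%m%d%H%M%S` timestamp format is an assumption inferred from example folder names such as seed0_20240811003419.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def run_log_dir(env: str, algo: str, seed: int,
                start: Optional[datetime] = None,
                root: str = "./logs") -> Path:
    """Build <root>/<env>/<algo>/seed<seed>_<start-time> (illustrative only)."""
    start = start if start is not None else datetime.now()
    # Assumed timestamp layout: year, month, day, hour, minute, second.
    stamp = start.strftime("%Y%m%d%H%M%S")
    return Path(root) / env / algo / f"seed{seed}_{stamp}"


if __name__ == "__main__":
    print(run_log_dir("DubinsCar", "plangcbf+", 0,
                      datetime(2024, 8, 11, 0, 34, 19)))
```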

In addition to the hyperparameters of GCBF+, the following flags set the planner hyperparameters:

  • --lr-planner: learning rate of the planner
  • --loss-plan-stl-coef: coefficient of the planned path STL loss
  • --loss-achievable-coef: coefficient of the achievable STL loss (difference between the planned path and the real path)
  • --loss-real-stl-coef: (optional) coefficient of the real path STL loss (try differentiating through the environment)
  • --buffer-size: size of the replay buffer
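
As a rough illustration of how the three loss coefficients interact, the sketch below combines the loss terms as a weighted sum. The weighted-sum form, the `PlannerLossCoefs` class, and the `planner_loss` function are assumptions made for this example; the actual training code may combine the terms differently. The default values mirror the training command shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannerLossCoefs:
    """Coefficients named after the CLI flags (hypothetical container)."""
    plan_stl: float = 0.5     # --loss-plan-stl-coef
    achievable: float = 10.0  # --loss-achievable-coef
    real_stl: float = 0.5     # --loss-real-stl-coef (optional term)


def planner_loss(plan_stl_loss: float, achievable_loss: float,
                 real_stl_loss: float, coefs: PlannerLossCoefs) -> float:
    """Weighted sum of the three loss terms (assumed form, not the repo's code)."""
    return (coefs.plan_stl * plan_stl_loss
            + coefs.achievable * achievable_loss
            + coefs.real_stl * real_stl_loss)


if __name__ == "__main__":
    # Made-up loss values, used only to show the weighting.
    print(planner_loss(1.0, 0.1, 2.0, PlannerLossCoefs()))
```

Under this assumed form, a large --loss-achievable-coef penalizes plans that the controller cannot track closely, while setting the real-path coefficient to 0 disables the optional term.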


To test the learned planner on the spec it was trained on, where log_path is the path to the log folder (e.g. logs/DubinsCar/plangcbf+/seed0_20240811003419/), use:

python --path <log_path> --epi 1 --area-size 4 -n 2 --obs 0 --nojit-rollout --goal-sample-interval 20 --log --async-planner --ignore-on-finish

To use the MILP planner with the pre-trained GCBF+ controller, pass --planner stlpy as below:

python --path pretrained/DubinsCar/gcbf+/ --epi 1 --area-size 4 -n 2 --obs 0 --nojit-rollout --planner stlpy --spec-len 15 --goal-sample-interval 20 --spec cover3 --log --async-planner --ignore-on-finish

This should report the safety rate, goal reaching rate, and success rate of the learned model, and generate videos of the learned model in <path-to-log>/videos. Use the following flags to customize the test:

  • -n: number of agents
  • --obs: number of obstacles
  • --area-size: side length of the environment
  • --max-step: maximum number of steps for each episode, increase this if you have a large environment
  • --path: path to the log folder
  • --n-rays: number of LiDAR rays
  • --alpha: CBF alpha, used in centralized CBF-QP and decentralized CBF-QP
  • --max-travel: maximum travel distance of agents
  • --cbf: plot the CBF contour of this agent; only supports 2D environments
  • --seed: random seed
  • --debug: debug mode
  • --cpu: use CPU
  • --env: test environment (not needed if the log folder is specified using --path)
  • --algo: test algorithm (not needed if the log folder is specified using --path)
  • --step: test step (not needed if testing the last saved model)
  • --epi: number of episodes to test
  • --offset: offset of the random seeds
  • --no-video: do not generate videos
  • --log: log the results to a file
  • --nojit-rollout: do not use jit to speed up the rollout, used for large-scale tests
  • --async-planner: asynchronous goal change during deployment (since it is hard to synchronize an unknown number of agents)
  • --ignore-on-finish: ignore collisions after reaching the goal (assume agent vanishes/lands)
  • --planner: (for stlpy) test planner (not needed if the log folder is specified using --path)
  • --spec-len: (for stlpy) length of the STL specification (number of waypoints from the planner)
  • --spec: (for stlpy) STL specification
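
To make the three reported metrics concrete, the sketch below aggregates per-episode outcomes into rates. The definitions are assumptions for illustration (an outcome counts as a success only if it is both safe and goal-reaching), and `EpisodeResult` and `summarize` are hypothetical names. The test script may compute these quantities differently, for instance per agent rather than per episode.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class EpisodeResult:
    """Outcome of one test episode (hypothetical record type)."""
    safe: bool     # no collision occurred
    reached: bool  # the goals were reached


def summarize(results: Iterable[EpisodeResult]) -> Dict[str, float]:
    """Return safety, goal-reaching, and success rates as fractions in [0, 1]."""
    results = list(results)
    if not results:
        raise ValueError("no episodes to summarize")
    n = len(results)
    return {
        "safe_rate": sum(r.safe for r in results) / n,
        "reach_rate": sum(r.reached for r in results) / n,
        # Assumed definition: success requires both safety and goal reaching.
        "success_rate": sum(r.safe and r.reached for r in results) / n,
    }


if __name__ == "__main__":
    demo = [EpisodeResult(True, True), EpisodeResult(True, False),
            EpisodeResult(False, True), EpisodeResult(True, True)]
    print(summarize(demo))
```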

Pre-trained models

We provide the pre-trained GCBF+ controller from [1] in the folder pretrained.


This uses an underlying GCBF+ [1] controller, and we thank the authors for their excellent implementation upon which we added planning capabilities.


[1] GCBF+: A Neural Graph Control Barrier Function Framework for Distributed Safe Multi-Agent Control, Zhang, S. et al.

[2] Mixed-Integer Programming for Signal Temporal Logic with Fewer Binary Variables, Kurtz, Vincent, & Lin, Hai


