Official implementation of the paper: Joe Eappen, Zikang Xiong, Dipam Patel, Aniket Bera, Suresh Jagannathan, "Scaling Safe Multi-Agent Control for Signal Temporal Logic Specifications".
```
├── requirements.txt              # requirements file installed via pip
├── train.py                      # training script
├── test.py                       # testing script
├── plot.ipynb                    # plotting helper notebook
├── *.sh                          # bash scripts for running experiments
├── tests                         # (dir) simple testing scripts to load environment
├── pretrained                    # (dir) pre-trained models (saved GCBF+ model used in the paper)
└── gcbfplus                      # (dir) GCBF+ code from MIT-REALM
    ├── algo                      # (dir) GCBF+ and GCBF code
    │   ├── module                # (dir) High-level network modules
    │   │   ├── planner           # (dir) Planner modules
    │   │   └── ...
    │   ├── plan_gcbf_plus.py     # Planner with GCBF+ controller algorithm
    │   └── ...
    ├── env                       # (dir) Environment code
    │   ├── wrapper               # (dir) Environment wrappers with STL interface
    │   │   ├── wrapper.py        # Environment wrapper with STL interface
    │   │   ├── async_goal.py     # Asynchronous goal wrapper (for asynchronous goal changes during deployment)
    │   │   └── ...
    │   └── ...
    ├── nn                        # (dir) Core neural network modules
    ├── stl                       # (dir) Signal Temporal Logic (STL) utilities
    ├── trainer                   # (dir) Training utilities
    ├── utils                     # (dir) Utility functions
    ├── configs/default_config.yaml  # Default configuration file
    └── ...
```
We recommend using Conda to install the requirements:

```bash
conda create -n mastl-gcbf python=3.10
conda activate mastl-gcbf
cd mastl-gcbf
```
Then install JAX following the official instructions (an example command is sketched after the install steps below), and install the CPU version of PyTorch (for easy compatibility with the diff-spec package requirements without interfering with the JAX installation):

```bash
pip3 install torch --index-url https://download.pytorch.org/whl/cpu
```
Then install the rest of the dependencies:

```bash
pip install -r requirements.txt
```
Install the package by running:

```bash
pip install -e .
```
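For the JAX step above, here is a minimal sketch of one possible install command, assuming a CUDA 12 machine; check the official JAX installation instructions for the variant that matches your hardware:

```bash
# Example only: choose the extra that matches your accelerator.
pip install -U "jax[cuda12]"
# CPU-only machines can instead use:
# pip install -U jax
```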
We use the 2D environments from the original GCBF+ [1] paper: `SingleIntegrator`, `DoubleIntegrator`, and `DubinsCar`.

We provide several planners: the STLPY MILP planner [2] (`stlpy`), the GNN-ODE planner (`gnn-ode`), and the ODE planner (`ode`), which is the GNN-ODE planner without the GNN component. Use `--planner` to specify the planner.
Experiment with different STL specifications by changing the `--spec` flag (examples of the naming scheme follow this list). We provide the following STL specifications:

- `coverN`: cover N regions, e.g., `cover3` covers 3 regions
- `seqN`: sequence of N regions, e.g., `seq3` sequentially visits 3 regions
- `MbranchN`: M branches with N regions each, e.g., `2branch3` has 2 branches with 3 regions each
- `MloopN`: loop M times over N regions, e.g., `2loop3` has 2 loops with 3 regions each
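As a quick illustration of the naming scheme, all of the following are valid `--spec` values (combine them with the other flags shown in the training and testing commands below):

```bash
--spec cover3    # cover 3 regions
--spec seq4      # visit 4 regions in sequence
--spec 2branch3  # 2 branches with 3 regions each
--spec 2loop3    # loop twice over 3 regions
```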
For the STLPY MILP planner [2], use the vanilla GCBF+ controller (`gcbf+`), which does not need to be trained; for the GNN-ODE planner, use the pretrained GCBF+ controller with the learnable planner (`plangcbf+`). Use `--algo` to specify the controller.
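To make the pairing concrete, the flag combinations look roughly like this (only the planner/controller flags are shown; the remaining flags follow the full commands below):

```bash
# MILP planning on top of the vanilla (untrained) GCBF+ controller
--algo gcbf+ --planner stlpy

# Learnable GNN-ODE (or ODE) planner on top of the pretrained GCBF+ controller
--algo plangcbf+ --planner gnn-ode
```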
To reproduce the results shown in our paper, one can refer to `settings.yaml`.
To train the planner (for the `plangcbf+` setting with a `gnn-ode` or `ode` planner) given the pretrained GCBF+ controller, use:

```bash
python train.py --algo plangcbf+ --env DubinsCar -n 8 --area-size 4 --n-env-train 8 --n-env-test 8 --load-dir ./pretrained/DubinsCar/gcbf+/models/ --spec cover3 --spec-len 15 --lr-planner 1e-5 --planner gnn-ode --goal-sample-interval 30 --loss-real-stl-coef 0.5 --loss-plan-stl-coef 0.5 --steps 2500 --loss-achievable-coef 10
```
In our paper, we use 8 agents with 1000 training steps. The training logs will be saved in the folder `./logs/<env>/<algo>/seed<seed>_<training-start-time>`. We also provide the following flags (a short debug-run example is sketched after the flag lists):
- `-n`: number of agents
- `--env`: environment, including `SingleIntegrator`, `DoubleIntegrator`, `DubinsCar`, `LinearDrone`, and `CrazyFlie`
- `--algo`: algorithm, including `gcbf`, `gcbf+`, and `plangcbf+`
- `--seed`: random seed
- `--steps`: number of training steps
- `--name`: name of the experiment
- `--debug`: debug mode: no recording, no saving, and no JIT
- `--obs`: number of obstacles
- `--n-rays`: number of LiDAR rays
- `--area-size`: side length of the environment
- `--n-env-train`: number of environments for training
- `--n-env-test`: number of environments for testing
- `--log-dir`: path to save the training logs
- `--eval-interval`: interval of evaluation
- `--eval-epi`: number of episodes for evaluation
- `--save-interval`: interval of saving the model
- `--goal-sample-interval`: interval of sampling new goals
- `--spec-len`: length of the STL specification (number of waypoints from the planner)
- `--spec`: STL specification
- `--lr-planner`: learning rate of the planner
- `--planner`: planner, including `gnn-ode` and `ode`
In addition to the GCBF+ hyperparameters, we use the following flags to specify the planner-related hyperparameters:
- `--lr-planner`: learning rate of the planner
- `--loss-plan-stl-coef`: coefficient of the planned-path STL loss
- `--loss-achievable-coef`: coefficient of the achievable STL loss (difference between the planned path and the real path)
- `--loss-real-stl-coef`: (optional) coefficient of the real-path STL loss (tries differentiating through the environment)
- `--buffer-size`: size of the replay buffer
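For a quick sanity check before a full training run, a shorter debug invocation might look like the following (illustrative flag values, not the settings used in the paper):

```bash
# Illustrative smoke test: few agents, few steps; --debug disables recording, saving, and JIT.
python train.py --algo plangcbf+ --env DubinsCar -n 2 --area-size 4 \
    --load-dir ./pretrained/DubinsCar/gcbf+/models/ \
    --planner gnn-ode --spec cover3 --spec-len 15 \
    --steps 100 --debug --name smoke-test
```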
To test the learned planner on the spec it was trained on, where `<log_path>` is the path to the log folder (e.g., `logs/DubinsCar/plangcbf+/seed0_20240811003419/`), use:
```bash
python test.py --path <log_path> --epi 1 --area-size 4 -n 2 --obs 0 --nojit-rollout --goal-sample-interval 20 --log --async-planner --ignore-on-finish
```
To use the MILP planner with the pre-trained GCBF+ controller, pass `--planner stlpy` as below:
```bash
python test.py --path pretrained/DubinsCar/gcbf+/ --epi 1 --area-size 4 -n 2 --obs 0 --nojit-rollout --planner stlpy --spec-len 15 --goal-sample-interval 20 --spec cover3 --log --async-planner --ignore-on-finish
```
This should report the safety rate, goal-reaching rate, and success rate of the learned model, and generate videos of the learned model in `<path-to-log>/videos`. Use the following flags to customize the test (a larger-scale example is sketched after this list):
- `-n`: number of agents
- `--obs`: number of obstacles
- `--area-size`: side length of the environment
- `--max-step`: maximum number of steps for each episode; increase this if you have a large environment
- `--path`: path to the log folder
- `--n-rays`: number of LiDAR rays
- `--alpha`: CBF alpha, used in the centralized CBF-QP and decentralized CBF-QP
- `--max-travel`: maximum travel distance of agents
- `--cbf`: plot the CBF contour of this agent; only supports 2D environments
- `--seed`: random seed
- `--debug`: debug mode
- `--cpu`: use CPU
- `--env`: test environment (not needed if the log folder is specified using `--path`)
- `--algo`: test algorithm (not needed if the log folder is specified using `--path`)
- `--step`: test step (not needed if testing the last saved model)
- `--epi`: number of episodes to test
- `--offset`: offset of the random seeds
- `--no-video`: do not generate videos
- `--log`: log the results to a file
- `--nojit-rollout`: do not use JIT to speed up the rollout; used for large-scale tests
- `--async-planner`: asynchronous goal change during deployment (since it is hard to synchronize an unknown number of agents)
- `--ignore-on-finish`: ignore collisions after reaching the goal (assume the agent vanishes/lands)
- `--planner`: (for stlpy) test planner (not needed if the log folder is specified using `--path`)
- `--spec-len`: (for stlpy) length of the STL specification (number of waypoints from the planner)
- `--spec`: (for stlpy) STL specification
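For example, a larger-scale test run (illustrative numbers, not the settings used in the paper) could raise the agent count and episode length while disabling JIT rollouts and video generation:

```bash
# Illustrative large-scale test; adjust the numbers for your hardware.
python test.py --path <log_path> --epi 5 -n 32 --obs 8 --area-size 8 \
    --max-step 512 --nojit-rollout --no-video --log \
    --async-planner --ignore-on-finish
```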
We provide the pre-trained GCBF+ controller from [1] in the `pretrained` folder.
This work builds on the GCBF+ [1] controller, and we thank the authors for their excellent implementation, on top of which we added planning capabilities.
[1] S. Zhang et al., "GCBF+: A Neural Graph Control Barrier Function Framework for Distributed Safe Multi-Agent Control."
[2] V. Kurtz and H. Lin, "Mixed-Integer Programming for Signal Temporal Logic with Fewer Binary Variables."