This repository contains a PyTorch implementation for solving the Vehicle Routing Problem (VRP) using reinforcement learning [1].
- Advantage = total distance of predicted route - critic value
- Actor loss = advantage * log probability
- Critic loss = advantage^2
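
The three terms above can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's actual code: the function name `actor_critic_losses`, the tensor shapes, averaging over the batch, and detaching the advantage in the actor term are all assumptions.

```python
import torch


def actor_critic_losses(tour_length, critic_value, log_prob):
    """Return (actor_loss, critic_loss), each averaged over the batch.

    tour_length:  (batch,) total distance of each predicted route
    critic_value: (batch,) critic's estimate of that distance
    log_prob:     (batch,) summed log probability of the predicted route
    """
    advantage = tour_length - critic_value
    # Assumption: detach so the actor term does not update the critic.
    actor_loss = (advantage.detach() * log_prob).mean()
    critic_loss = advantage.pow(2).mean()
    return actor_loss, critic_loss


# Toy batch of two routes (made-up numbers).
tour_length = torch.tensor([10.0, 12.0])
critic_value = torch.tensor([9.0, 13.0], requires_grad=True)
log_prob = torch.tensor([-2.0, -3.0], requires_grad=True)

actor_loss, critic_loss = actor_critic_losses(tour_length, critic_value, log_prob)
```

With these toy values the advantages are 1 and -1, so the actor loss is 0.5 and the critic loss is 1.0.
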
- CNN Embedding
- Single LSTM Layer
- 2 Dense Layers
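
As a rough sketch of how the listed components could be chained in PyTorch: the class name `SketchNet`, the hidden size of 128, the kernel size of 1, the ReLU between the dense layers, and the way the layers are wired together are all assumptions made for illustration. The repository's actual model may split these components differently between actor and critic.

```python
import torch
import torch.nn as nn


class SketchNet(nn.Module):
    """Hypothetical wiring: CNN embedding -> single LSTM layer -> 2 dense layers."""

    def __init__(self, input_size=2, hidden_size=128):
        super().__init__()
        # 1D convolution with kernel size 1 embeds each node independently.
        self.embed = nn.Conv1d(input_size, hidden_size, kernel_size=1)
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers=1, batch_first=True)
        self.dense = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, coords):
        # coords: (batch, num_nodes, input_size)
        x = self.embed(coords.transpose(1, 2))   # (batch, hidden, num_nodes)
        out, _ = self.lstm(x.transpose(1, 2))    # (batch, num_nodes, hidden)
        # Use the last LSTM output as a summary of the whole instance.
        return self.dense(out[:, -1]).squeeze(-1)  # (batch,)


net = SketchNet()
values = net(torch.rand(4, 10, 2))  # 4 instances, 10 nodes, (x, y) coordinates
```
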
- SGD
- learning rate 1.0
- L2 Gradient Clipping 2.0
- Batch Size 128
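
The settings above map onto PyTorch roughly as shown below. The `nn.Linear` model and the random batch are hypothetical stand-ins, and reading "L2 Gradient Clipping 2.0" as clipping the global L2 norm of the gradients to 2.0 is an assumption.

```python
import torch
import torch.nn as nn

model = nn.Linear(2, 1)  # hypothetical stand-in for the actor or critic
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)

batch = torch.rand(128, 2)          # batch size 128
loss = model(batch).pow(2).mean()   # placeholder loss

optimizer.zero_grad()
loss.backward()
# Rescale gradients so that their combined L2 norm is at most 2.0.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=2.0, norm_type=2)
# Measure the norm after clipping, before the optimizer step.
clipped_norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in model.parameters()))
optimizer.step()
```
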
- OS: macOS 14.0 23A344 arm64
- Host: MacBookPro17,1
- CPU: Apple M1
- GPU: Apple M1
- Memory: 16384MiB
[1] Nazari, M. et al.: Reinforcement Learning for Solving the Vehicle Routing Problem.