[EASY] Delay DB metrics until after restart is done (#2581)
# Description
Restarting the autopilot (especially when trying out a new config) can be
risky because startup can be extremely slow.
If you have to revert your temporary change because something is broken,
that slow startup basically turns into a mini-outage.

# Changes
Don't update the DB metrics (table sizes) immediately; start by waiting
instead.
This effectively delays updating the DB metrics until **after** the most
important post-restart work is done (i.e. building the new auction /
filling the native price cache).
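
For illustration, here is a minimal, self-contained sketch of this "sleep before the first update" pattern. The `Postgres` stub, its error type, and the `main` wrapper are placeholders added so the snippet compiles on its own; the real handle and `update_database_metrics` live in `crates/autopilot/src/database.rs` (see the diff below).

```rust
use std::time::Duration;

// Placeholder for the autopilot's `Postgres` handle so this sketch is
// self-contained; the real type lives in crates/autopilot/src/database.rs.
struct Postgres;

impl Postgres {
    // Stand-in for the real method that runs the expensive table-size queries.
    async fn update_database_metrics(&self) -> Result<(), &'static str> {
        Ok(())
    }
}

async fn database_metrics(db: Postgres) -> ! {
    loop {
        // Sleep *before* the first update so the expensive metrics queries
        // stay off the DB while the freshly restarted system builds its
        // first auction and fills the native price cache.
        tokio::time::sleep(Duration::from_secs(60)).await;
        if let Err(err) = db.update_database_metrics().await {
            tracing::error!(?err, "failed to update table rows metric");
        }
    }
}

// Placeholder entry point; assumes the `tokio` (with rt, macros and time
// features) and `tracing` crates are available as dependencies.
#[tokio::main]
async fn main() {
    database_metrics(Postgres).await;
}
```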

## How to test
I built a hacky benchmark that simply starts the autopilot connected to
the read replica of the prod mainnet DB (to get the most realistic data).
It showed that the query to fetch open orders takes ~10s-20s when we
immediately start updating the DB metrics in parallel, but only ~2s-3s
when we delay updating those metrics.
MartinquaXD authored Apr 2, 2024
1 parent 31b7641 commit 11e356c
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion crates/autopilot/src/database.rs
@@ -139,10 +139,13 @@ pub fn run_database_metrics_work(db: Postgres) {
 
 async fn database_metrics(db: Postgres) -> ! {
     loop {
+        // The DB gets used a lot right after starting the system.
+        // Since these queries are quite expensive we delay them
+        // to improve the startup time of the system.
+        tokio::time::sleep(Duration::from_secs(60)).await;
         if let Err(err) = db.update_database_metrics().await {
             tracing::error!(?err, "failed to update table rows metric");
         }
-        tokio::time::sleep(Duration::from_secs(60)).await;
     }
 }
