Create and configure custom web status checker [API-417] #90

Open
wants to merge 1 commit into
base: develop
139 changes: 139 additions & 0 deletions app/Console/Commands/WebStatusCheck.php
@@ -0,0 +1,139 @@
<?php

namespace App\Console\Commands;

use Illuminate\Console\Command;
use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Log;
use Carbon\Carbon;
use Carbon\CarbonInterface;

class WebStatusCheck extends Command
{
    protected $signature = 'monitor:website {domains*}';
    protected $description = 'Monitor specified domains for non-200 responses';

    private const INITIAL_WAIT = 60;
    private const ALERT_INTERVALS = [300, 120, 60];
    private const MAX_ALERTS = 7;

    private $domainStates = [];

    public function handle()
    {
        $domains = $this->argument('domains');

        // Setup states for domains
        foreach ($domains as $domain) {
            $this->domainStates[$domain] = [
                'firstAlertTime' => null,
                'alertCount' => 0,
                'lastAlertTime' => null,
                'currentWaitIndex' => 0,
            ];
        }

        $this->info('Starting monitoring of domains: ' . implode(', ', $domains));

        while (true) {
Member

So we expect this command always to be running? while (true) worries me. I wonder if, instead, we can think of this command as:

  1. Check all the domains in the list once, and log the status somewhere
  2. Go through the domain statuses and decide if any action needs to be taken on any of them

Then the code in this command becomes a single iteration, and the Kernel schedule below can run the command every minute.
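
As a rough illustration of that shape (a sketch only, not code from this PR): the hypothetical `evaluateDomain()` below takes one domain's stored state and the latest status code, and returns the new state plus an optional Slack message. The function name, the `firstAlertAt` key, the `MAX_ALERTS_PER_OUTAGE` constant, and the idea of persisting state between runs (cache or database) are all assumptions, and the back-off intervals are left out for brevity.

```php
<?php

// Hypothetical cap, mirroring MAX_ALERTS in the command under review.
const MAX_ALERTS_PER_OUTAGE = 7;

/**
 * Decide what a single scheduled run should do for one domain.
 * Pure function: loading and saving $state is left to the caller.
 *
 * @param array{alertCount: int, firstAlertAt: ?int} $state
 * @return array{0: array{alertCount: int, firstAlertAt: ?int}, 1: ?string}
 *         The new state and the message to send (null means stay quiet).
 */
function evaluateDomain(string $domain, int $status, array $state, int $now): array
{
    if ($status === 200) {
        // Recovered after at least one alert: notify once, then reset.
        $message = $state['alertCount'] > 0 ? "{$domain} is back up" : null;

        return [['alertCount' => 0, 'firstAlertAt' => null], $message];
    }

    if ($state['alertCount'] >= MAX_ALERTS_PER_OUTAGE) {
        // Alert budget for this outage is used up: keep state, send nothing.
        return [$state, null];
    }

    $newState = [
        'alertCount' => $state['alertCount'] + 1,
        // Remember when the outage started so later alerts can report downtime.
        'firstAlertAt' => $state['firstAlertAt'] ?? $now,
    ];

    return [$newState, "{$domain} responded {$status}"];
}
```

With something like this, `handle()` would load each domain's state, call the function once per domain, save the returned state, and send the message if there is one. No `sleep()` or `while (true)` is needed, because the scheduler provides the loop.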

            foreach ($domains as $domain) {
                try {
                    $response = Http::get("https://{$domain}");

                    if ($response->status() !== 200) {
                        $this->handleNon200Response($domain, $response->status());
                    } elseif ($this->domainStates[$domain]['alertCount'] > 0) {
                        // Site is back up after previous alerts
                        $this->sendSlackMessage("{$domain} is back up :blob_excited:");
                        $this->resetMonitoring($domain);
                    }

                    // If we haven't hit max alerts wait for next check
                    if (
                        $this->domainStates[$domain]['alertCount'] < self::MAX_ALERTS ||
                        $this->domainStates[$domain]['alertCount'] === self::MAX_ALERTS
                    ) {
                        $waitTime = $this->getWaitTime($domain);
                        sleep($waitTime);
                    }
                } catch (\Exception $e) {
                    $this->info("Error monitoring {$domain}: " . $e->getMessage());
                    Log::error("Monitoring error for {$domain}: " . $e->getMessage());
Comment on lines +60 to +61
Member

Could you just use $this->error() here?

                    sleep(60);
                }
            }
        }
    }

    private function handleNon200Response(string $domain, int $responseCode)
    {
        $currentTime = Carbon::now();
        $state = &$this->domainStates[$domain];

        if ($state['alertCount'] === 0) {
            // First alert
            $state['firstAlertTime'] = $currentTime;
            $state['lastAlertTime'] = $currentTime;
            $state['alertCount']++;
            $this->sendSlackMessage(
                "{$domain} responded {$responseCode} :eyes:"
            );
            sleep(self::INITIAL_WAIT);
            return;
        }

        // Only send alerts if we haven't reached the maximum
        if ($state['alertCount'] < self::MAX_ALERTS) {
            $downtime = $currentTime->diffForHumans($state['firstAlertTime'], [
                'syntax' => CarbonInterface::DIFF_ABSOLUTE,
            ]);
            $this->sendSlackMessage(
                "{$domain} responded {$responseCode} for {$downtime} :ahhhhhhhhh:"
            );
            $state['alertCount']++;
            $state['lastAlertTime'] = $currentTime;

            // Move to next wait interval
            if ($state['currentWaitIndex'] < count(self::ALERT_INTERVALS) - 1) {
                $state['currentWaitIndex']++;
            }
        }
    }

    private function getWaitTime(string $domain): int
    {
        $state = $this->domainStates[$domain];

        if ($state['alertCount'] === 0) {
            return 60; // Default check interval when everything is normal
        }

        if ($state['alertCount'] === self::MAX_ALERTS) {
            return 60; // After max alerts just keep checking every minute
        }

        return self::ALERT_INTERVALS[$state['currentWaitIndex']];
    }

    private function sendSlackMessage(string $message)
    {
        $webhookUrl = config('aic.monitoring.slack.webhook_url');
        try {
            Http::post($webhookUrl, [
                'text' => $message
            ]);
        } catch (\Exception $e) {
            Log::error('Failed to send Slack message: ' . $e->getMessage());
Member

Consider $this->error() for uniformity.

        }
    }

    private function resetMonitoring(string $domain)
    {
        $this->domainStates[$domain] = [
            'firstAlertTime' => null,
            'alertCount' => 0,
            'lastAlertTime' => null,
            'currentWaitIndex' => 0,
        ];
    }
}
6 changes: 6 additions & 0 deletions app/Console/Kernel.php
@@ -181,6 +181,12 @@ protected function schedule(Schedule $schedule): void
                ->sundays()
                ->withoutOverlapping(self::FOR_ONE_YEAR);
        }

        if (config('aic.monitoring.enabled')) {
            $schedule->command('monitor:website ' . implode(' ', config('aic.monitoring.domains')))
                ->daily()
                ->withoutOverlapping(self::FOR_ONE_YEAR);
        }
    }

    /**
8 changes: 8 additions & 0 deletions config/aic.php
@@ -137,4 +137,12 @@
        'product_url' => env('SHOP_PRODUCT_URL'),

    ],

    'monitoring' => [
        'enabled' => env('MONITORING_ENABLED'),
        'slack' => [
            'webhook_url' => env('SLACK_ALERT_WEBHOOK'),
        ],
        'domains' => explode(',', env('MONITORED_DOMAINS', '')),
    ],
];