Feat: Improve Prometheus Metrics #1338

Merged 15 commits on Jan 31, 2025
22 changes: 15 additions & 7 deletions docs/src/pages/guides/cli-options.md
@@ -253,13 +253,21 @@ Then you can scrape the metrics from `http://localhost:3001/metrics`.

 Monika exposes [Prometheus default metrics](https://prometheus.io/docs/instrumenting/writing_clientlibs/#standard-and-runtime-collectors), [Node.js specific metrics](https://github.com/siimon/prom-client/tree/master/lib/metrics), and Monika probe metrics below.

-| Metric Name | Type | Purpose | Label |
-| -------------------------------------- | --------- | -------------------------------------------- | ------------------------------------------- |
-| `monika_probes_total` | Gauge | Collect total probe | - |
-| `monika_request_status_code_info` | Gauge | Collect HTTP status code | `id`, `name`, `url`, `method` |
-| `monika_request_response_time_seconds` | Histogram | Collect duration of probe request in seconds | `id`, `name`, `url`, `method`, `statusCode` |
-| `monika_request_response_size_bytes` | Gauge | Collect size of response size in bytes | `id`, `name`, `url`, `method`, `statusCode` |
-| `monika_alert_total` | Counter | Collect total alert triggered | `id`, `name`, `url`, `method`, `alertQuery` |
+| Metric Name | Type | Purpose | Labels |
+| -------------------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------- |
+| `monika_alerts_triggered` | Counter | Indicates the count of incident alerts triggered | `id`, `name`, `url`, `method`, `alertQuery` |
+| `monika_alerts_triggered_total` | Counter | Indicates the cumulative count of incident alerts triggered | - |
+| `monika_probes_running` | Gauge | Indicates whether a probe is running (1) or idle (0). Running means the probe is currently sending requests, while idle means the probe is waiting for the next request to be sent. |
+| `monika_probes_running_total` | Gauge | Indicates the total count of probes that are currently running. Running means the probe is currently sending requests. | - |
+| `monika_probes_status` | Gauge | Indicates whether a probe is healthy (1) or is having an incident (0) | `id`, `name`, `url`, `method` |
+| `monika_probes_total` | Gauge | Total count of all probes configured | - |
+| `monika_request_response_size_bytes` | Gauge | Indicates the size of probe request's response in bytes | `id`, `name`, `url`, `method`, `statusCode`, `result` |
+| `monika_request_response_time_seconds` | Histogram | Indicates the duration of the probe request in seconds | `id`, `name`, `url`, `method`, `statusCode`, `result` |
+| `monika_request_status_code_info` | Gauge | Indicates the HTTP status code of the probe requests' response(s) | `id`, `name`, `url`, `method` |
+| `monika_notifications_triggered` | Counter | Indicates the count of notifications triggered | `type`, `status` |
+| `monika_notifications_triggered_total` | Counter | Indicates the cumulative count of notifications triggered | - |

+Aside from the above metrics, Monika also exposes [Prometheus default metrics](https://prometheus.io/docs/instrumenting/writing_clientlibs/#standard-and-runtime-collectors) and [Node.js specific metrics](https://github.com/siimon/prom-client/tree/master/lib/metrics)
+
 ## Repeat

16 changes: 10 additions & 6 deletions packages/notification/index.ts
@@ -34,29 +34,33 @@ async function sendNotifications(
   notifications: Notification[],
   message: NotificationMessage,
   sender?: InputSender
-): Promise<void> {
+): Promise<{ type: string; success: boolean }[]> {
   if (sender) {
     updateSender(sender)
   }

-  await Promise.all(
+  // Map notifications to an array of results
+  const results = await Promise.all(
     notifications.map(async ({ data, type }) => {
       const channel = channels[type]

       try {
         if (!channel) {
           throw new Error('Notification channel is not available')
         }

         await channel.send(data, message)
+        return { type, success: true }
       } catch (error: unknown) {
-        const message = getErrorMessage(error)
-        throw new Error(
-          `Failed to send message using ${type}, please check your ${type} notification config.\nMessage: ${message}`
+        const errorMessage = getErrorMessage(error)
+        console.error(
+          `Failed to send message using ${type}, please check your ${type} notification config.\nMessage: ${errorMessage}`
         )
+        return { type, success: false }
       }
     })
   )
+
+  return results
 }

 export { sendNotifications }
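
With this change, each channel reports its own outcome instead of rejecting the whole batch. The pattern can be sketched in isolation as follows. This is a minimal illustration under assumptions: `Channel`, `SendResult`, and `sendAll` are hypothetical names introduced here, not Monika's actual types or functions.

```typescript
// Hypothetical stand-in for a notification channel; not Monika's real interface.
type Channel = { send: (text: string) => Promise<void> }
type SendResult = { type: string; success: boolean }

// Sends to every requested channel concurrently. A failure is logged and
// recorded as `success: false`, so one bad channel does not reject Promise.all.
async function sendAll(
  channels: Record<string, Channel | undefined>,
  types: string[],
  text: string
): Promise<SendResult[]> {
  return Promise.all(
    types.map(async (type): Promise<SendResult> => {
      try {
        const channel = channels[type]
        if (!channel) {
          throw new Error('Notification channel is not available')
        }

        await channel.send(text)
        return { type, success: true }
      } catch (error: unknown) {
        const reason = error instanceof Error ? error.message : String(error)
        console.error(`Failed to send message using ${type}: ${reason}`)
        return { type, success: false }
      }
    })
  )
}
```

The design trade-off is that callers no longer see a rejected promise on failure, so they have to inspect the returned array to learn which channels failed.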
10 changes: 9 additions & 1 deletion src/components/notification/index.ts
@@ -22,11 +22,13 @@
  * SOFTWARE. *
  **********************************************************************************/

+import { getEventEmitter } from '../../utils/events'
 import { ValidatedResponse } from '../../plugins/validate-response'
 import getIp from '../../utils/ip'
 import { getMessageForAlert } from './alert-message'
 import { sendNotifications } from '@hyperjumptech/monika-notification'
 import type { Notification } from '@hyperjumptech/monika-notification'
+import events from '../../events'

 type SendAlertsProps = {
   probeID: string
@@ -54,5 +56,11 @@ export async function sendAlerts({
     response: validation.response,
   })

-  return sendNotifications(notifications, message)
+  const results = await sendNotifications(notifications, message)
+  for (const result of results) {
+    getEventEmitter().emit(events.notifications.sent, {
+      type: result.type,
+      status: result.success ? 'success' : 'failed',
+    })
+  }
 }
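
The loop above turns each boolean outcome into a `type`/`status` pair, which matches the labels documented for `monika_notifications_triggered`. A rough sketch of how such pairs could be tallied is shown here. `ChannelResult`, `toSentEvent`, and `LabelledCounter` are hypothetical names, and the in-memory counter only stands in for a real Prometheus counter.

```typescript
type ChannelResult = { type: string; success: boolean }
type NotificationSentEvent = { type: string; status: 'success' | 'failed' }

// Same mapping as in sendAlerts: boolean outcome -> status label.
function toSentEvent(result: ChannelResult): NotificationSentEvent {
  return { type: result.type, status: result.success ? 'success' : 'failed' }
}

// Hypothetical in-memory counter keyed by label values (assumption: this is
// only an illustration of a labelled counter, not the collector's real code).
class LabelledCounter {
  private counts: Record<string, number> = {}

  private key(labels: NotificationSentEvent): string {
    return `${labels.type}|${labels.status}`
  }

  // Increments the series identified by the given label values.
  inc(labels: NotificationSentEvent): void {
    const key = this.key(labels)
    this.counts[key] = (this.counts[key] ?? 0) + 1
  }

  // Returns the current value of a series, or 0 if it was never incremented.
  get(labels: NotificationSentEvent): number {
    return this.counts[this.key(labels)] ?? 0
  }
}
```

Keeping `status` as a label rather than a separate metric lets one query compare success and failure rates per channel type.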
14 changes: 13 additions & 1 deletion src/components/probe/prober/http/index.ts
@@ -159,6 +159,12 @@ export class HTTPProber extends BaseProber {
       response,
     })

+    getEventEmitter().emit(events.probe.status.changed, {
+      probe: this.probeConfig,
+      requestIndex,
+      status: 'up',
+    })
+
     this.logMessage(
       true,
       getProbeResultMessage({
@@ -226,10 +232,16 @@ export class HTTPProber extends BaseProber {
     }
     const alertId = getAlertID(url, validation, probeID)

+    getEventEmitter().emit(events.probe.status.changed, {
+      probe: this.probeConfig,
+      requestIndex,
+      status: 'down',
+    })
+
     getEventEmitter().emit(events.probe.alert.triggered, {
       probe: this.probeConfig,
       requestIndex,
-      alertQuery: '',
+      alertQuery: triggeredAlert,
     })

     addIncident({
6 changes: 6 additions & 0 deletions src/components/probe/prober/index.ts
@@ -134,6 +134,7 @@ export abstract class BaseProber implements Prober {

       // this probe is definitely in incident state because of fail assertion, so send notification, etc.
       this.handleFailedProbe(probeResults)
+
       return
     }

@@ -148,6 +149,11 @@ export abstract class BaseProber implements Prober {
         requestIndex: index,
         response: requestResponse,
       })
+      getEventEmitter().emit(events.probe.status.changed, {
+        probe: this.probeConfig,
+        requestIndex: index,
+        status: 'up',
+      })
       logResponseTime(requestResponse.responseTime)

       if (
6 changes: 6 additions & 0 deletions src/events/index.ts
@@ -34,6 +34,9 @@ export default {
     sanitized: 'CONFIG_SANITIZED',
     updated: 'CONFIG_UPDATED',
   },
+  notifications: {
+    sent: 'NOTIFICATIONS_SENT',
+  },
   probe: {
     alert: {
       triggered: 'PROBE_ALERT_TRIGGERED',
@@ -46,5 +49,8 @@ export default {
     notification: {
       willSend: 'PROBE_NOTIFICATION_WILL_SEND',
     },
+    status: {
+      changed: 'PROBE_STATUS_CHANGED',
+    },
   },
 }
4 changes: 4 additions & 0 deletions src/loaders/index.ts
@@ -82,6 +82,8 @@ function initPrometheus(prometheusPort: number) {
     decrementProbeRunningTotal,
     incrementProbeRunningTotal,
     resetProbeRunningTotal,
+    collectProbeStatus,
+    collectNotificationSentMetrics,
   } = new PrometheusCollector()

   // collect prometheus metrics
@@ -93,6 +95,8 @@ function initPrometheus(prometheusPort: number) {
   eventEmitter.on(events.probe.ran, incrementProbeRunningTotal)
   eventEmitter.on(events.probe.finished, decrementProbeRunningTotal)
   eventEmitter.on(events.config.updated, resetProbeRunningTotal)
+  eventEmitter.on(events.probe.status.changed, collectProbeStatus)
+  eventEmitter.on(events.notifications.sent, collectNotificationSentMetrics)

   startPrometheusMetricsServer(prometheusPort)
 }
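
The wiring above subscribes a collector to `PROBE_STATUS_CHANGED`. The sketch below shows one way such a listener could maintain a 1/0 gauge like the documented `monika_probes_status`. Everything except the event name is an assumption: `TinyEmitter` is a minimal stand-in for the real event emitter, the payload is simplified to a `probeId`, and mapping `'up'` to 1 and `'down'` to 0 is inferred from the docs table rather than taken from the collector's code.

```typescript
type ProbeStatusEvent = { probeId: string; status: 'up' | 'down' }
type Listener = (payload: ProbeStatusEvent) => void

// Minimal synchronous emitter, a hypothetical stand-in for the app's emitter.
class TinyEmitter {
  private listeners: Record<string, Listener[]> = {}

  on(event: string, listener: Listener): void {
    const list = this.listeners[event] ?? []
    list.push(listener)
    this.listeners[event] = list
  }

  emit(event: string, payload: ProbeStatusEvent): void {
    for (const listener of this.listeners[event] ?? []) {
      listener(payload)
    }
  }
}

// Gauge values keyed by probe id: 1 = healthy, 0 = having an incident.
const probeStatusGauge: Record<string, number> = {}

// Assumed mapping: 'up' -> 1, 'down' -> 0.
function collectProbeStatus({ probeId, status }: ProbeStatusEvent): void {
  probeStatusGauge[probeId] = status === 'up' ? 1 : 0
}

const emitter = new TinyEmitter()
emitter.on('PROBE_STATUS_CHANGED', collectProbeStatus)
```

Because a gauge holds only the latest value, repeated `'up'` events for the same probe are harmless: they overwrite the series with 1 each time.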