This document describes the observability solution (logging, monitoring, alerting, events, and OpenTelemetry) as currently implemented in KubeGems.
The monitoring plugin monitoring deploys the following resources:

- kube-prometheus-stack: the core component. It deploys and manages Prometheus, Alertmanager, and kube-state-metrics via prometheus-operator, and also ships ServiceMonitors for services such as the apiserver and kubelet.
- kubegems-default-monitor-amconfig: the global Alertmanager configuration, referenced in kube-prometheus-stack:

  ```yaml
  alertmanager:
    alertmanagerConfiguration:
      name: kubegems-default-monitor-alert-rule
      templates:
        - configMap:
            key: kubegems-email.tmpl
            name: kubegems-email-template
  ```

- kubegems-email-template: the email alert template. Besides being referenced by kube-prometheus-stack, it also has to be referenced in the receiver when we configure an AlertmanagerConfig:

  ```yaml
  apiVersion: monitoring.coreos.com/v1alpha1
  kind: AlertmanagerConfig
  ...
  spec:
    receivers:
      - emailConfigs:
          - headers:
              - key: subject
                value: Kubegems alert [{{ .CommonLabels.gems_alertname }}:{{ .Alerts.Firing | len }}] in [cluster:{{ .CommonLabels.cluster }}] [namespace:{{ .CommonLabels.gems_namespace }}]
            html: '{{ template "email.common.html" . }}'
  ...
  ```

- kubegems-default-record-rule: the preset recording rules.
- alertproxy: the alert proxy component, which forwards alerts to Feishu, DingTalk, and Alibaba Cloud SMS/voice; see https://github.com/kubegems/alertproxy for details.
For usage documentation, see https://www.kubegems.io/docs/category/%E7%9B%91%E6%8E%A7%E4%B8%AD%E5%BF%83
We provide two kinds of monitoring templates:

- PromQL templates: for conveniently querying Prometheus. The built-in templates live at https://github.com/kubegems/kubegems/blob/main/config/promql_tpl.yaml and are inserted when the database is initialized. Templates follow a three-level structure: scope, resource, and rule (see the sketch below this list).
- Dashboard templates: for conveniently creating monitoring dashboards. The built-in templates live at https://github.com/kubegems/kubegems/tree/main/config/dashboards and are inserted when the database is initialized.
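To illustrate the scope/resource/rule hierarchy, here is a purely hypothetical sketch of a PromQL template entry. The field layout below is an assumption for illustration only; the authoritative schema is config/promql_tpl.yaml:

```yaml
# hypothetical example only, not the real promql_tpl.yaml schema
- name: containers            # scope
  resources:
    - name: memory            # resource
      rules:
        - name: usage         # rule
          expr: container_memory_working_set_bytes
          unit: bytes
```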
Additionally, the script https://github.com/kubegems/kubegems/blob/main/scripts/export-monitor-tpls/main.go can export the template data from a target cluster in one go, e.g.:

```sh
go run scripts/export-monitor-tpls/main.go --context mycluster
```
- system-alert: a template file used to create the preset system alert rules in every cluster. These system alerts are created and managed by kubegems-worker, which currently syncs them once a day:
```go
func (s *AlertRuleSyncTasker) Crontasks() map[string]Task {
    return map[string]Task{
        // sync alert rule state
        "@every 5m": {
            Name:  "sync alertrule state",
            Group: "alertrule",
            Steps: []workflow.Step{{Function: TaskFunction_SyncAlertRuleState}},
        },
        // check alert rule config
        "@every 12h": {
            Name:  "check alertrule config",
            Group: "alertrule",
            Steps: []workflow.Step{{Function: TaskFunction_CheckAlertRuleConfig}},
        },
        // sync system alert rules
        "@daily": {
            Name:  "sync system alertrule",
            Group: "alertrule",
            Steps: []workflow.Step{{Function: TaskFunction_SyncSystemAlertRule}},
        },
    }
}
```
The collector management on the KubeGems UI actually operates on ServiceMonitor resources, so users can define custom scraping there. For the middleware integrations in the integration center, the ServiceMonitor is built into each exporter's Helm chart; see https://github.com/kubegems/appstore-charts for details.
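For reference, a minimal ServiceMonitor of the kind such an integration would create might look like the following; the name, labels, and port are hypothetical placeholders:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-exporter            # hypothetical name
  namespace: mynamespace
spec:
  selector:
    matchLabels:
      app: my-exporter         # must match the exporter Service's labels
  endpoints:
    - port: metrics            # name of the Service port exposing /metrics
      path: /metrics
      interval: 30s
```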
Monitoring alerts are implemented with the PrometheusRule and AlertmanagerConfig CRDs. However, because KubeGems spans multiple clusters, and to make alert rules easier to query and manage, we first save the configured alert rules to the MySQL database before operating on the CRDs, and then sync them to the cluster. Create, update, and delete all follow this logic, with database transactions guaranteeing consistency:
```go
// create
func (p *AlertRuleProcessor) CreateAlertRule(ctx context.Context, req *models.AlertRule) error {
    return p.DBWithCtx(ctx).Transaction(func(tx *gorm.DB) error {
        allRules := []models.AlertRule{}
        if err := tx.Find(&allRules, "cluster = ? and namespace = ? and name = ?", req.Cluster, req.Namespace, req.Name).Error; err != nil {
            return err
        }
        if len(allRules) > 0 {
            return errors.Errorf("alert rule %s is already exist", req.Name)
        }
        for _, rec := range req.Receivers {
            if rec.ID > 0 {
                return errors.Errorf("receiver's id should be null when create")
            }
        }
        if err := tx.Omit("Receivers.AlertChannel").Create(req).Error; err != nil {
            return err
        }
        return p.SyncAlertRule(ctx, req)
    })
}
```
```go
// update
func (p *AlertRuleProcessor) UpdateAlertRule(ctx context.Context, req *models.AlertRule) error {
    return p.DBWithCtx(ctx).Transaction(func(tx *gorm.DB) error {
        if err := updateReceiversInDB(req, tx); err != nil {
            return errors.Wrap(err, "update receivers")
        }
        if err := tx.Select("expr", "for", "message", "inhibit_labels", "alert_levels", "promql_generator", "logql_generator").
            Updates(req).Error; err != nil {
            return err
        }
        return p.SyncAlertRule(ctx, req)
    })
}
```
```go
// delete (excerpt from the HTTP handler)
if err := h.withAlertRuleProcessor(c.Request.Context(), c.Param("cluster"), func(ctx context.Context, p *AlertRuleProcessor) error {
    h.SetExtraAuditDataByClusterNamespace(c, req.Cluster, req.Namespace)
    action := i18n.Sprintf(ctx, "delete")
    module := i18n.Sprintf(ctx, "monitor alert rule")
    h.SetAuditData(c, action, module, req.Name)
    return p.DBWithCtx(ctx).Transaction(func(tx *gorm.DB) error {
        if err := tx.First(req, "cluster = ? and namespace = ? and name = ?", req.Cluster, req.Namespace, req.Name).Error; err != nil {
            return err
        }
        if err := tx.Delete(req).Error; err != nil {
            return err
        }
        return p.deleteMonitorAlertRule(ctx, req)
    })
}); err != nil {
    handlers.NotOK(c, err)
    return
}
```
In addition, disabling and enabling alert rules, as well as blacklist management, are all implemented by managing Alertmanager silences:

- Disable an alert: create a silence matching its namespace and alertname labels
- Enable an alert: delete that silence
- Add to the blacklist: create a silence matching all labels of the alert message
- Remove from the blacklist: delete that silence
```go
// disable an alert
func createSilenceIfNotExist(ctx context.Context, namespace, alertName string, cli agents.Client) error {
    silence, err := getSilence(ctx, namespace, alertName, cli)
    if err != nil {
        return err
    }
    // if it does not exist yet, create it
    if silence == nil {
        silence = &alerttypes.Silence{
            Comment:   fmt.Sprintf("silence for %s", alertName),
            CreatedBy: alertName,
            Matchers: labels.Matchers{
                &labels.Matcher{
                    Type:  labels.MatchEqual,
                    Name:  prometheus.AlertNamespaceLabel,
                    Value: namespace,
                },
                &labels.Matcher{
                    Type:  labels.MatchEqual,
                    Name:  prometheus.AlertNameLabel,
                    Value: alertName,
                },
            },
            StartsAt: time.Now(),
            EndsAt:   time.Now().AddDate(1000, 0, 0), // 1000 years, i.e. effectively never expires
        }
        // create
        return cli.DoRequest(ctx, agents.Request{
            Method: http.MethodPost,
            Path:   "/custom/alertmanager/v1/silence/_/actions/create",
            Body:   silence,
        })
    }
    return nil
}
```
Alert channels are tenant-scoped. When managing them we only operate on the MySQL database; a channel is only applied to the receiver of the corresponding AlertmanagerConfig when an alert rule is created or updated. Because we support several kinds of alert channels, and each kind produces a different receiver, this is implemented through a channel interface:
```go
type ChannelIf interface {
    // convert to an alertmanagerconfig receiver object
    ToReceiver(name string) v1alpha1.Receiver
    // check whether the configuration is valid
    Check() error
    // test whether the alert channel works
    Test(alert prometheus.WebhookAlert) error
    String() string
}
```
To add a new alert channel, we not only have to implement this interface, but also add a corresponding case to the UnmarshalJSON method:
```go
// UnmarshalJSON to deserialize []byte
func (m *ChannelConfig) UnmarshalJSON(b []byte) error {
    if string(b) == "null" {
        m.ChannelIf = nil
        return nil
    }
    tmp := struct {
        ChannelType `json:"channelType"`
    }{}
    if err := json.Unmarshal(b, &tmp); err != nil {
        return errors.Wrap(err, "unmarshal channelType")
    }
    switch tmp.ChannelType {
    case TypeWebhook:
        webhook := Webhook{}
        if err := json.Unmarshal(b, &webhook); err != nil {
            return errors.Wrap(err, "unmarshal webhook channel")
        }
        m.ChannelIf = &webhook
    case TypeEmail:
        email := Email{}
        if err := json.Unmarshal(b, &email); err != nil {
            return errors.Wrap(err, "unmarshal email channel")
        }
        m.ChannelIf = &email
    ...
    default:
        return fmt.Errorf("unknown channel type: %s", tmp.ChannelType)
    }
    return nil
}
```
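As a concrete illustration, a minimal sketch of a new channel type is shown below. The Slack type, its fields, and TypeSlack are hypothetical assumptions (not existing KubeGems code); the sketch is assumed to live in the same package as ChannelIf, and a corresponding case TypeSlack branch would still have to be added to UnmarshalJSON:

```go
// Slack is a hypothetical channel that posts alerts to a webhook URL.
type Slack struct {
    ChannelType `json:"channelType"` // would be TypeSlack
    WebhookURL  string               `json:"webhookURL"`
}

// ToReceiver converts the channel into an alertmanagerconfig receiver.
func (s *Slack) ToReceiver(name string) v1alpha1.Receiver {
    return v1alpha1.Receiver{
        Name: name,
        WebhookConfigs: []v1alpha1.WebhookConfig{
            {URL: &s.WebhookURL},
        },
    }
}

// Check validates the channel configuration.
func (s *Slack) Check() error {
    if s.WebhookURL == "" {
        return fmt.Errorf("slack webhook url is required")
    }
    return nil
}

// Test sends a test alert; a real implementation would POST the alert
// payload to the webhook and inspect the response.
func (s *Slack) Test(alert prometheus.WebhookAlert) error {
    return nil
}

func (s *Slack) String() string {
    return s.WebhookURL
}
```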
The logging plugin logging deploys the following resources:

- logging-operator: Banzai Cloud's logging-operator; only the operator itself is deployed.
- logging-operator-logging: the logging components built on the logging-operator, including fluentbit and fluentd; it also creates the default clusterOutputs, namely kubegems-container-console-output.
- loki: the Loki + Redis services. Note that the Prometheus remote write target and the Alertmanager address are configured here:

  ```yaml
  ruler:
    storage:
      type: local
      local:
        directory: /rules
    rule_path: /tmp/rules
    alertmanager_url: {{ .Values.monitoring.alertmanager.address }}
    remote_write:
      enabled: true
      client:
        url: {{ .Values.monitoring.prometheus.address }}/api/v1/write
  ```

- record-and-alert-rules: the preset recording rules, which periodically query Loki, produce metrics, and write them to Prometheus via remote write. Log alert rules can also be configured here:

  ```yaml
  apiVersion: v1
  kind: ConfigMap
  metadata:
    annotations:
      bundle.kubegems.io/ignore-options: OnUpdate
    name: kubegems-loki-rules
    namespace: {{ .Release.Namespace }}
  data:
    kubegems-loki-recording-rules.yaml: |-
      groups:
        - interval: 1m
          name: kubegems-loki-recording-rules
          rules:
            - expr: sum(count_over_time({stream=~"stdout|stderr"}[1m])) by (namespace, pod, container)
              record: gems_loki_logs_count_last_1m
            - expr: sum(count_over_time({stream=~"stdout|stderr"} |~ `error` [1m])) by (namespace, pod, container)
              record: gems_loki_error_logs_count_last_1m
  ```
见 https://www.kubegems.io/docs/category/%E6%97%A5%E5%BF%97%E4%B8%AD%E5%BF%83
- Simplified mode: directly collect the logs of every container in the whole namespace

  ```yaml
  apiVersion: logging.banzaicloud.io/v1beta1
  kind: Flow
  metadata:
    name: default
    namespace: mynamespace
  spec:
    filters:
      - prometheus:
          labels:
            container: $.kubernetes.container_name
            flow: default
            namespace: $.kubernetes.namespace_name
            node: $.kubernetes.host
            pod: $.kubernetes.pod_name
          metrics:
            - desc: Total number of log entries collected by this each flow
              name: gems_logging_flow_records_total
              type: counter
    globalOutputRefs:
      - kubegems-container-console-output
  ```

- Application log collection: collect the logs of a specified application

  ```yaml
  apiVersion: logging.banzaicloud.io/v1beta1
  kind: Flow
  metadata:
    name: kubegems-eventer-flow
    namespace: kubegems-eventer
  spec:
    globalOutputRefs:
      - kubegems-container-console-output
    match:
      - select:
          labels:
            app.kubernetes.io/name: kubernetes-event-exporter
  ```
Loki log alerts work much the same way as Prometheus monitoring alerts. The difference is that monitoring alert rules live in a PrometheusRule, whereas log alert rules are mounted into Loki as a ConfigMap; it is enough to manage the ConfigMap kubegems-loki-rules in the namespace kubegems-logging.
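For illustration, a log alert rule added to that ConfigMap would use the standard Loki ruler format (Prometheus-style rule groups with a LogQL expr). The rule below is a hypothetical sketch; the group name, selector, threshold, and labels are assumptions:

```yaml
groups:
  - name: example-log-alerts          # hypothetical group
    rules:
      - alert: TooManyErrorLogs
        # fire when a pod in mynamespace prints more than 100 "error" lines in 5 minutes
        expr: sum(count_over_time({namespace="mynamespace"} |~ `error` [5m])) by (pod) > 100
        for: 5m
        labels:
          severity: warning
        annotations:
          message: 'pod {{ $labels.pod }} is printing too many error logs'
```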
```go
// sync a monitoring alert rule
func (p *AlertRuleProcessor) syncMonitorAlertRule(ctx context.Context, alertrule *models.AlertRule) error {
    if err := p.syncEmailSecret(ctx, alertrule); err != nil {
        return errors.Wrap(err, "sync secret failed")
    }
    if err := p.syncPrometheusRule(ctx, alertrule); err != nil {
        return errors.Wrap(err, "sync prometheusrule failed")
    }
    if err := p.syncAlertmanagerConfig(ctx, alertrule); err != nil {
        return errors.Wrap(err, "sync alertmanagerconfig failed")
    }
    return nil
}

// sync a logging alert rule
func (p *AlertRuleProcessor) syncLoggingAlertRule(ctx context.Context, alertrule *models.AlertRule) error {
    if err := p.syncEmailSecret(ctx, alertrule); err != nil {
        return errors.Wrap(err, "sync secret failed")
    }
    if err := p.syncLokiRules(ctx, alertrule); err != nil {
        return errors.Wrap(err, "sync loki rules failed")
    }
    if err := p.syncAlertmanagerConfig(ctx, alertrule); err != nil {
        return errors.Wrap(err, "sync alertmanagerconfig failed")
    }
    return nil
}
```
As shown above, logging and monitoring alerts differ only in the second step, so enabling, disabling, blacklisting, and alert channel management all work the same way as for monitoring alerts and are not repeated here.
See https://github.com/kubegems/plugins/blob/main/plugins/eventer/templates/eventer.yaml and note the configuration items:

```yaml
values:
  config:
    kubeQPS: {{ .Values.eventer.kubeQPS }}
    kubeBurst: {{ .Values.eventer.kubeBurst }}
```

If cluster event collection stops because of client-side rate limiting, these values need to be raised appropriately.
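For example, the plugin values could be overridden along these lines; the numbers are illustrative assumptions and should be tuned to the cluster's event volume:

```yaml
# hypothetical override of the eventer plugin values (maps to .Values.eventer.* above)
eventer:
  kubeQPS: 100
  kubeBurst: 200
```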
- eventer list-watches Event resources
- Event information is printed to the console log
- fluentbit collects it -> fluentd -> Loki
Note:

- In future OpenTelemetry releases, spanmetrics may no longer live in a processor but be migrated to a connector
- Prometheus uses active scraping rather than remote-write, so we configured a ServiceMonitor:

```yaml
serviceMonitor:
  enabled: true
  metricsEndpoints:
    - honorLabels: true
      port: metrics
    - honorLabels: true
      port: otlp-metrics
```
### Integrating OpenTelemetry

To distinguish which `namespace`, `service_name`, `pod`, and `node` the data comes from, applications running on KubeGems need the following environment variables:
```yaml
- name: OTEL_K8S_NODE_NAME
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: spec.nodeName
- name: OTEL_K8S_POD_NAME
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: metadata.name
- name: OTEL_SERVICE_NAME
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: metadata.labels['app']
- name: OTEL_K8S_NAMESPACE
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: metadata.namespace
- name: OTEL_RESOURCE_ATTRIBUTES
  value: service.name=$(OTEL_SERVICE_NAME),namespace=$(OTEL_K8S_NAMESPACE),node=$(OTEL_K8S_NODE_NAME),pod=$(OTEL_K8S_POD_NAME)
- name: OTEL_EXPORTER_OTLP_ENDPOINT
  value: http://opentelemetry-collector.observability:4318 # for gRPC, change the port to 4317
- name: OTEL_EXPORTER_OTLP_INSECURE
  value: "true"
```
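With these variables set, an application instrumented with the standard OpenTelemetry Go SDK needs no hard-coded endpoint. The snippet below is a minimal sketch (not KubeGems code); it assumes the app uses the otlptracehttp exporter, which reads OTEL_EXPORTER_OTLP_ENDPOINT from the environment, and the SDK's default resource, which picks up OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES:

```go
package main

import (
    "context"
    "log"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
    ctx := context.Background()

    // the OTLP/HTTP exporter reads OTEL_EXPORTER_OTLP_ENDPOINT from the environment
    exporter, err := otlptracehttp.New(ctx)
    if err != nil {
        log.Fatal(err)
    }

    // the default resource picks up OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES
    tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter))
    defer func() { _ = tp.Shutdown(ctx) }()
    otel.SetTracerProvider(tp)

    // emit one example span
    _, span := otel.Tracer("demo").Start(ctx, "hello")
    span.End()
}
```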