Prometheus Input - auto discovery and leader election in fleet #4126

Open
Alphayeeeet opened this issue Jan 24, 2024 · 12 comments

Labels
Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team

Comments

@Alphayeeeet

Alphayeeeet commented Jan 24, 2024

Describe the enhancement:
When deploying Elastic Agent as a DaemonSet in Kubernetes, I also want to scrape metrics from custom workloads. In standalone mode, Prometheus endpoints can be auto-discovered by defining labels or annotations. Fleet, with its Prometheus integration and the new Prometheus Input, should offer the same possibilities.

Additionally, if autodiscovery (or any metric scraping in general) is enabled, every agent in the DaemonSet tries to scrape those metrics, so the data gets ingested multiple times. This should be avoided by using Kubernetes leader election or, even better, by load-balancing the scraping tasks so that each distinct endpoint is scraped by a single agent. That way the load would be distributed more evenly.

Describe a specific use case for the enhancement or feature:
Cloud Native Monitoring currently requires you to deploy a Prometheus server instance and use remote write to a dedicated Elastic Agent. Since Elastic Agent already provides the scraping capability, it should also be able to scrape metrics directly from distributed workloads. The endpoints can be configured via autodiscovery, with the necessary endpoint information provided through labels/annotations.

What is the definition of done?
The Prometheus/Input integration should be extended to support the requirements mentioned above. The Fleet UI should offer configuration options for variables read from labels/annotations. A condition also needs to be settable (e.g. a label that signals that a pod exposes metrics).

There should also be a way to avoid duplicate data when using this integration in a distributed environment. Could a condition be set to use the leader status from the Kubernetes integration?

Please explain what should be inserted in the conditions to validate this change, and I will add it.

@cmacknz added the Team:Cloudnative-Monitoring label Jan 24, 2024
@Alphayeeeet
Author

Leader election is already possible in the Prometheus integration. It should also be configurable for the technical preview Prometheus Input.

Rerouting into different datasets and namespaces based on Kubernetes annotations should also be possible, like it is for Kubernetes container logs.

@Alphayeeeet
Author

@cmacknz Any updates by now?

@pierrehilbert
Contributor

Hello @Alphayeeeet,
Sorry for the delay here.
@gizas Would you have time to have a look here?
cc @bturquet

@gizas
Contributor

gizas commented Jul 8, 2024

Thanks @Alphayeeeet, let me try to kick off the discussion with some basic information:

  1. This is how hints/annotations-based autodiscovery is configured in standalone Elastic Agent
  2. And this is the specific template for the Prometheus integration, which needs to be either mounted or added to the agent's inputs config
  3. Then, by using co.elastic.hints/package: prometheus, you will be able to use autodiscovery for the pods that have the specific annotations configured (see the example pod annotations after the note below).

NOTE: The hints autodiscovery is based on annotations
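
For illustration, a minimal sketch of a workload carrying such hints annotations (the workload name and values are examples; the annotation keys follow the co.elastic.hints/* scheme used in the config sample later in this thread):

  apiVersion: v1
  kind: Pod
  metadata:
    name: my-app                                  # hypothetical workload
    annotations:
      co.elastic.hints/package: "prometheus"      # which integration the agent should apply
      co.elastic.hints/port: "8080"               # port exposing the metrics endpoint
      co.elastic.hints/period: "10s"              # scrape interval
  spec:
    containers:
      - name: my-app
        image: my-app:latest
        ports:
          - containerPort: 8080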

Regarding the second thing you mention, leader election: there is indeed an election mechanism, but for the collector type only.
This is the condition: ${kubernetes_leaderelection.leader} == true

The remote write configuration does not currently support leader election. But you can always set conditions like condition: ${kubernetes.container.name} == 'prometheus-server' to specify where to find the Prometheus container/application.

Note: Conditions-based autodiscovery takes priority over hints autodiscovery.
The same conditions can of course be used in the Prometheus collector configuration.
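
As a rough illustration, a minimal sketch of a standalone collector input gated on that leader condition (IDs, host, and values are illustrative assumptions; the kubernetes_leaderelection provider must be enabled in the agent config):

  inputs:
    - id: prometheus/metrics-leader
      type: prometheus/metrics
      streams:
        - id: prometheus-collector-leader
          # only the agent currently holding the Kubernetes leader lease runs this stream,
          # so the target is scraped once instead of by every member of the DaemonSet
          condition: ${kubernetes_leaderelection.leader} == true
          data_stream:
            dataset: prometheus.collector
            type: metrics
          metricsets:
            - collector
          hosts:
            - 'http://prometheus-server.monitoring.svc:9090'   # hypothetical scrape target
          metrics_path: /metrics
          period: 10s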

For remote write the logic is different (the Prometheus server pushes the data), which is why the notion of a leader does not make sense there. But if you wish to scale Elastic Agents with remote_write configured, we can put a Kubernetes Service in front of multiple agents and have Prometheus remote write send to that Service. Let me know if you are also interested in this scenario and I can provide the info as well; I don't think it is that relevant at the moment.
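
For reference only, a rough sketch of that scaling pattern (the Service name, selector labels, port 9201, and the /write path are assumptions about the agent's remote_write listener, not confirmed defaults):

  # Kubernetes Service fronting the Elastic Agent pods that run the
  # prometheus remote_write input (listener port assumed to be 9201)
  apiVersion: v1
  kind: Service
  metadata:
    name: elastic-agent-remote-write          # hypothetical name
    namespace: kube-system
  spec:
    selector:
      app: elastic-agent                      # must match the agent Deployment/DaemonSet labels
    ports:
      - name: remote-write
        port: 9201
        targetPort: 9201
  ---
  # prometheus.yml excerpt (separate file): Prometheus pushes samples to the Service,
  # which spreads the load across the agents behind it
  # remote_write:
  #   - url: "http://elastic-agent-remote-write.kube-system.svc:9201/write"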

Please let me know if the above is OK and matches the details of your setup.

@Alphayeeeet
Author

Hi @gizas,

Thank you for the update. Unfortunately, those docs are only for standalone Elastic Agents; I need the same configuration for Fleet-managed Elastic Agents. However, even if autodiscovery were possible, we would still have the rerouting issue: we want to change the namespace of the metrics depending on the pod annotations, like it is possible with kubernetes.container_logs.

I hope it's a bit clearer now.

@Alphayeeeet
Author

Alphayeeeet commented Jul 8, 2024

Another topic is adding the kubernetes.* fields. I have some issues when trying to use the add_kubernetes_metadata processor: it doesn't add any metadata at all, even though I have it configured as documented here: https://www.elastic.co/guide/en/beats/metricbeat/current/add-kubernetes-metadata.html

I need the metadata fields as they're added by the Kubernetes integration. I believe that's what the processor should do.
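
For context, the documented processor usage looks roughly like this (a minimal sketch based on the linked Metricbeat docs; the indexer/matcher shown is an assumption, not necessarily the exact configuration used here):

  processors:
    - add_kubernetes_metadata:
        host: ${NODE_NAME}                       # limit the watcher to the local node
        default_indexers.enabled: false
        default_matchers.enabled: false
        indexers:
          - ip_port:
        matchers:
          - fields:
              lookup_fields: ["metricset.host"]  # match events to pods by the scraped host:port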

@Alphayeeeet
Author

Actually, my desired scenario would be hints-based autodiscovery in Fleet-managed Elastic Agents for logs, as in #5015, and for metrics too. There should be a way to configure which integration/metricset is used to parse those metrics/logs afterwards, and also a way, for the generic ones (custom logs or the Prometheus collector), to configure the dataset and provide our own ingest pipelines. If that makes sense, I would suggest opening a new ticket on this topic.

@gizas
Contributor

gizas commented Jul 9, 2024

Thanks for the clarifications @Alphayeeeet.
Hints-based autodiscovery is not supported in Fleet-managed agents. (This is an old feature request here, which we decided to put on hold for now.)

We want to change the namespace of the metrics depending on the pod annotations, like it is possible with kubernetes.container_logs.

Conditions-based autodiscovery, on the other hand, works for both managed and standalone.
So what you want to achieve, a different namespace like for kubernetes.container_logs, is to change the dataset.name as in the picture, right?

[Screenshot 2024-07-09: Fleet UI showing the dataset name setting]

If yes, the change of dataset.name is supported in prometheus since v1.3.0 (and this PR)
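
In standalone terms that maps to overriding the dataset on the collector stream, roughly like this (the dataset name and host are illustrative):

  streams:
    - id: prometheus-collector-custom-dataset
      data_stream:
        dataset: prometheus.my_app              # custom dataset instead of the default prometheus.collector
        namespace: default
        type: metrics
      metricsets:
        - collector
      hosts:
        - 'http://my-app.default.svc:8080'      # hypothetical scrape target
      metrics_path: /metrics
      period: 10s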

Another topic is adding the kubernetes.* fields. I have some issues when trying to use the add_kubernetes_metadata processor: it doesn't add any metadata at all

In your case you should make use of the kubernetes provider to achieve the metadata enrichment:

Sample of Elastic agent config
  elastic-agent.yml: |-
    ...truncated...
    agent:
      ...truncated...
    providers:
      kubernetes:
        node: ${NODE_NAME}
        scope: node
        add_resource_metadata:
          ...truncated...
          deployment: true
    inputs:
      - id: prometheus/metrics-prometheus-${kubernetes.pod.name}-${kubernetes.container.id}
        type: prometheus/metrics
        processors:
          - add_fields:
              target: orchestrator
              fields:
                cluster.name: {{ .Values.global.cluster_name }}
                cluster.role: {{ .Values.global.cluster_role }}
                platform.type: {{ .Values.global.platform_type }}
                source.agent: elastic-agent
        use_output: metrics
        meta:
          package:
            name: prometheus
            version: 1.1.0
        streams:
          - id: prometheus-metrics-prometheus-${kubernetes.pod.name}-${kubernetes.container.id}
            condition: ${kubernetes.labels.co.elastic.hints/package} == "prometheus"
            data_stream:
              dataset: prometheus.collector
              namespace: ${kubernetes.labels.app.kubernetes.io/name|'default'}
              type: metrics
            hosts:
              - ${kubernetes.labels.co.elastic.hints/protocol|'http'}://${kubernetes.pod.ip}:${kubernetes.labels.co.elastic.hints/port|'8080'}
            metrics_filters.exclude: null
            metrics_filters.include: null
            metrics_path: /metrics
            metricsets:
              - collector
            period: ${kubernetes.labels.co.elastic.hints/period|'10s'}
            ssl.verification_mode: ${kubernetes.labels.co.elastic.hints/sslVerificationMode|'full'}
            rate_counters: true
            use_types: true
          - id: prometheus-metrics-prometheus-${kubernetes.pod.name}-${kubernetes.container.id}
            condition: ${kubernetes.annotations.co.elastic.hints/package} == "prometheus"
            data_stream:
              dataset: prometheus.collector
              namespace: ${kubernetes.labels.app.kubernetes.io/name|'default'}
              type: metrics
            hosts:
              - ${kubernetes.annotations.co.elastic.hints/protocol|'http'}://${kubernetes.pod.ip}:${kubernetes.annotations.co.elastic.hints/port|'8080'}
            metrics_filters.exclude: null
            metrics_filters.include: null
            metrics_path: /metrics
            metricsets:
              - collector
            period: ${kubernetes.annotations.co.elastic.hints/period|'10s'}
            ssl.verification_mode: ${kubernetes.annotations.co.elastic.hints/sslVerificationMode|'full'}
            rate_counters: true
            use_types: true

In the example above you can see that the namespace is configured based on the label: namespace: ${kubernetes.labels.app.kubernetes.io/name|'default'}. You need to add a kubernetes.* variable in the Prometheus config so that the kubernetes provider can match it and apply the autodiscovery.

there should be the possibility to configure which integration/metricset is used to parse those metrics/logs afterwards.

It sounds to me like you are trying to achieve both logs and metrics ingestion. You can always install 2 integrations: the Kubernetes integration with only logs enabled, and the Prometheus integration to collect the metrics you want.

@Alphayeeeet
Author

Thank you @gizas, I will try it and raise any upcoming issues here in the conversation.

@Alphayeeeet
Author

@gizas I would suggest moving the discussion and error part into the forum: https://discuss.elastic.co/t/fleet-managed-elastic-agent-kubernetes-prometheus-metrics-autodiscover/362808

@Alphayeeeet
Author

@gizas Could you please provide some assistance? I replied to the discussion with the errors I received after configuring the Prometheus integration policy.

@Alphayeeeet
Author

@gizas As an update: I finally figured out how to configure the Prometheus integration in my Kubernetes environment. However, the rerouting feature requires special API key permissions, for which I opened a PR: elastic/integrations#10592
