
Unable to gather advanced metrics, ERROR "was collected before with the same name and label values" #228

Open
yevon opened this issue Sep 27, 2020 · 10 comments

@yevon

yevon commented Sep 27, 2020

Describe the problem

When I install the advanced kube-state-metrics deployment, the dashboard that gathers metrics stops working. If I check the do-agent pod logs, I see errors about metrics being collected more than once with the same name and label values. I followed this guide to enable advanced metrics:

https://www.digitalocean.com/docs/kubernetes/how-to/monitor-advanced/

If I uninstall the advanced metrics deployment or scale its pods to 0, the dashboard starts working again.

Steps to reproduce

The issue occurs with kube-state-metrics:2.0.0-alpha.

Expected behavior

Be able to gather advanced pod scheduling metrics.

System Information

DigitalOcean Managed Kubernetes 1.18.8

do-agent log:

2020-09-27T13:03:09.553931294Z ERROR: 2020/09/27 13:03:09 /home/do-agent/cmd/do-agent/run.go:60: failed to gather metrics: 45 error(s) occurred:
2020-09-27T13:03:09.553990229Z * collected metric "kube_daemonset_status_number_available" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554014114Z * collected metric "kube_daemonset_status_number_available" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554019932Z * collected metric "kube_daemonset_status_number_available" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554042459Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554048024Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554052770Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554057690Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554062350Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554067062Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554071742Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554076542Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554081945Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554086634Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554092869Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554097734Z * collected metric "kube_deployment_status_replicas_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554102457Z * collected metric "kube_daemonset_status_number_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554127517Z * collected metric "kube_daemonset_status_number_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554133235Z * collected metric "kube_daemonset_status_number_unavailable" { gauge:<value:0 > } was collected before with the same name and label values
2020-09-27T13:03:09.554138043Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554148110Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554153232Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554157955Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554162715Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554167703Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554175570Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554183135Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554219858Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554241193Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554249105Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554256065Z * collected metric "kube_deployment_status_replicas_available" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554263653Z * collected metric "kube_daemonset_status_desired_number_scheduled" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554270760Z * collected metric "kube_daemonset_status_desired_number_scheduled" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554305026Z * collected metric "kube_daemonset_status_desired_number_scheduled" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554315521Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554323804Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554331593Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554338594Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:2 > } was collected before with the same name and label values
2020-09-27T13:03:09.554345304Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554362434Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554397076Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554401950Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554406640Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554411351Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554417625Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
2020-09-27T13:03:09.554422403Z * collected metric "kube_deployment_spec_replicas" { gauge:<value:1 > } was collected before with the same name and label values
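
For reference, this error is what the prometheus/client_golang registry reports when a single Gather call receives two metrics with the same name and identical label values, i.e. the same series is being collected twice. A minimal Go sketch that reproduces the error, using a made-up collector and placeholder label names and values:

package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// dupCollector deliberately emits the same gauge (same name, same label
// values) twice in one Collect call, mimicking a duplicated series.
type dupCollector struct {
	desc *prometheus.Desc
}

func (c *dupCollector) Describe(ch chan<- *prometheus.Desc) { ch <- c.desc }

func (c *dupCollector) Collect(ch chan<- prometheus.Metric) {
	for i := 0; i < 2; i++ {
		// "kube-system" and "do-agent" are placeholder label values.
		ch <- prometheus.MustNewConstMetric(
			c.desc, prometheus.GaugeValue, 2, "kube-system", "do-agent")
	}
}

func main() {
	reg := prometheus.NewRegistry()
	reg.MustRegister(&dupCollector{
		desc: prometheus.NewDesc(
			"kube_daemonset_status_number_available",
			"Example duplicated series.",
			[]string{"namespace", "daemonset"}, nil),
	})

	// Gather fails with: collected metric "kube_daemonset_status_number_available"
	// { ... gauge:<value:2 > } was collected before with the same name and label values
	if _, err := reg.Gather(); err != nil {
		fmt.Println(err)
	}
}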

yevon changed the title from "Unable to gather advanced metrics" to "Unable to gather advanced metrics, ERROR "was collected before with the same name and label values"" Sep 27, 2020
@bsnyder788
Contributor

@yevon I will look into this soon, and see if I can reproduce it on my end! Thanks for the report!

@bsnyder788
Contributor

I wasn't able to reproduce this issue, @yevon. We had a similar report for Ubuntu 20.04 on disk metrics collection that I just addressed in 3.8.0. This kube metrics one can't be ignored as easily as that one, though, since these are metrics we actually want to collect. Did you ever dig any deeper on your end?

@yevon
Author

yevon commented Nov 2, 2020

I haven't checked again with the latest version, but I reproduced this exact issue in another Kubernetes cluster in a different zone: a brand-new cluster, following the DO documentation for installing advanced metrics. It might be fixed in the latest advanced metrics or Kubernetes version; when I have time to check again, I will let you know.

@stale

stale bot commented Jan 3, 2021

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Jan 3, 2021
@bsnyder788
Contributor

still valid

@stale stale bot removed the stale label Jan 4, 2021
@stale

stale bot commented Jun 4, 2021

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Jun 4, 2021
@bsnyder788
Contributor

still valid

@stale stale bot removed the stale label Jun 4, 2021
@blockloop
Contributor

@bsnyder788 if you add the bug label to this issue, the stale bot will stop marking it as stale. I believe that's the correct label, but you can check the stale bot settings for this repo.

@bsnyder788 bsnyder788 added the bug label Jul 6, 2021
@vagkaefer

Any news on this? I'm seeing the same situation on a CloudLinux server. It was working normally, then I started getting these errors:

May 16 15:58:44 srv-001 DigitalOceanAgent[996139]: * collected metric "node_filesystem_free_bytes" { label:<name:"device" value:"/dev/vda1" > label:<name:"fstype" value:"xfs" > label:<name:"mountpoint" value:"/usr/share/cagefs-skeleton/opt" > gauge:<value:3.9912136704e+10 > } was collected before with the same name and label values
May 16 15:58:44 srv-001 DigitalOceanAgent[996139]: * collected metric "node_filesystem_size_bytes" { label:<name:"device" value:"/dev/vda1" > label:<name:"fstype" value:"xfs" > label:<name:"mountpoint" value:"/usr/share/cagefs-skeleton/usr/local/apache/domlogs" > gauge:<value:6.3254278144e+10 > } was collected before with the same name and label values

@vagkaefer

For now, I managed to solve this problem on my CloudLinux server:

  • Check that the do-agent user exists, because the service runs as this user by default (a quick check is sketched after this list).
  • Check whether the user is isolated in CageFS; if it is, remove it from CageFS.
  • I'm not sure it's advisable, but running the service as root also solves the problem...
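
A throwaway Go sketch for the first check above, assuming only that the account is literally named "do-agent":

package main

import (
	"fmt"
	"os/user"
)

func main() {
	// Look up the "do-agent" system user; the agent runs as this user by
	// default, so a missing (or CageFS-jailed) account can break collection.
	u, err := user.Lookup("do-agent")
	if err != nil {
		fmt.Println("do-agent user not found:", err)
		return
	}
	fmt.Printf("do-agent user exists: uid=%s gid=%s home=%s\n", u.Uid, u.Gid, u.HomeDir)
}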
