-
Hello.
You are right. A single Perforator installation can have agents running on multiple Kubernetes clusters. Native support for this scenario in the chart is tracked in #19. In fact, starting with v0.2.3 there is already some support, but I'm not sure whether it is enough for your case.
There is support for it in the agent. Unfortunately, there is no easy way to specify this field when using the chart. I have created #35 to track this feature request.
If you don't need to distinguish nodes within one cluster (based on rack / AZ / DC / etc.), then this workaround should work.
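For illustration, a minimal values.yaml sketch of that workaround; the option name and the node label key here are assumptions, not the chart's confirmed API (native support is tracked in #19 / #35):

```yaml
# Hypothetical values.yaml fragment. Assumes the chart exposes a
# topologyLabelKey-style option for the agent; the node label key
# is also an assumption.
agent:
  # Read the cluster name from this node label instead of the default
  # zone label, so profiles can be filtered by cluster in queries.
  topologyLabelKey: example.com/cluster-name
```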
-
I was trying today to deploy agents on only one of my clusters and have them talk to the "central" cluster that runs storage/web/proxy/gc/offline/agents. I created a small PR: https://github.com/yandex/perforator/pull/36/files
Then I added the hostname.
I also issued a cert via ACME for perforator-storage-grpc.dev-corp.com and put it into a TLS Secret Kubernetes object, so as not to use a self-signed cert for the storage server.
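The Secret looks roughly like this (the Secret name is illustrative; the data values are the base64-encoded ACME-issued files):

```yaml
# Rough shape of the TLS Secret; the name is illustrative.
apiVersion: v1
kind: Secret
metadata:
  name: perforator-storage-grpc-tls
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate chain>
  tls.key: <base64-encoded private key>
  ca.crt: <base64-encoded issuing CA>
```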
Agents in the central cluster work fine: they talk to storage via the LB and use the cert. But agents on the other cluster (I use the latest version, v0.0.3) crashloop, some with a huge stack trace.
The pod that does reach Running also doesn't talk to storage and says the certificate authority is unknown.
I shared the same cert to the standalone cluster via the extraDeploy object in my PR, so the tls.crt, ca.crt, and tls.key are the same for both clusters.
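Roughly what that extraDeploy entry looks like (a sketch; it assumes the chart renders arbitrary manifests from an extraDeploy list, as in my PR):

```yaml
# Sketch of values.yaml for the agents-only cluster: ship the same Secret
# through extraDeploy so agents there trust the same CA.
extraDeploy:
  - apiVersion: v1
    kind: Secret
    metadata:
      name: perforator-storage-grpc-tls
    type: kubernetes.io/tls
    data:
      tls.crt: <same base64 value as in the central cluster>
      tls.key: <same base64 value as in the central cluster>
      ca.crt: <same base64 value as in the central cluster>
```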
I also tried with a self-signed cert and, magically, one pod was running and sending data just fine, but the others hit the same error as above. As far as I can see from the code, TLS is mandatory on the server side, so there is no option to test without TLS.
-
Regarding the scary stack trace from tcmalloc, it looks like google/tcmalloc#160. It's very strange that we haven't seen such errors in production. We will try to update tcmalloc soon to get rid of the errors. Can you provide more information about the frequency and stability of such errors to assess the impact? Do agents fail every time, or does the error occur with some probability?
-
Hello dear maintainers,
Please advise: I want to profile my applications across dozens (potentially hundreds) of k8s clusters. As far as I understand, I need to run only the agents on those clusters, and all the other microservices can run on a separate cluster, right?
For the agent I only need to provide a storage service URL + gRPC port, so in my case I will probably expose storage as a LoadBalancer-type Service speaking gRPC.
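Something like this sketch (the Service name, selector labels, and port numbers are placeholders, not the chart's documented values):

```yaml
# Illustrative LoadBalancer Service in the central cluster exposing the
# storage gRPC endpoint to agents in other clusters. Selector and ports
# are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: perforator-storage-grpc
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: perforator-storage
  ports:
    - name: grpc
      port: 443          # external port the remote agents dial
      targetPort: grpc   # named container port on the storage pods
```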
I also want to introduce a 'cluster_name' label and use it in queries, but as far as I can tell from the code, the labels are pre-defined. A possible approach is to use topologyLabelKey and point it (instead of at a zone, since a zone makes no difference for me, to be honest) at the node label that contains the name of the cluster the node belongs to?
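For example, each node would carry something like this (the label key is just an example):

```yaml
# Illustrative node label carrying the cluster name; the key is an example.
apiVersion: v1
kind: Node
metadata:
  name: worker-1
  labels:
    example.com/cluster-name: prod-eu-1
```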
Is there a way to introduce custom labels? In highly distributed environments that would be really nice (possible candidates: env (dev/prod/stage), dc (America, Africa, Europe, Australia, etc.), cluster_name).
Thanks!