Agent fails to uninstall on adding multiple integrations with errors in config to agent policy. #6843

Closed
harshitgupta-qasource opened this issue Feb 13, 2025 · 9 comments
Labels
bug Something isn't working impact:high Short-term priority; add to current release, or definitely next. QA:Validated Validated by the QA Team Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments

@harshitgupta-qasource

Kibana Build details:

VERSION: 9.0.0-beta1
BUILD: 83575
COMMIT: a9ae718019d3909912f81e5d388ef597929071a1

Host OS and Browser version: Windows, All

Preconditions:

  1. A 9.0.0-beta1 Kibana cloud environment should be available.
  2. An agent should be installed.
  3. 7-8 different integrations should be added to the agent policy (Elastic Defend, AWS, Amazon DynamoDB, Custom Logs, MySQL, Nginx, Redis, System).

Steps to reproduce:

  1. Navigate to Fleet > Agents tab.
  2. Select any agent and add multiple integrations to its policy.
  3. Wait for 5-10 minutes.
  4. Try to uninstall elastic-agent from the endpoint.
  5. Observe that the agent cannot be uninstalled after multiple integrations are added to the agent policy.

Agent.json:
EC2AMAZ-4K4L5I2-agent-details.json

Agent Logs:

log.zip
elastic-agent-diagnostics-2025-02-13T09-49-11Z-00.zip

Agent Policy

elastic-agent(1).zip

Expected Result:
The agent should uninstall successfully when the uninstall command is run, even after multiple integrations with configuration errors have been added to the agent policy.

Note:

  • The user is also unable to collect agent diagnostics through the UI, and errors are observed in the CLI when running the collect command.

Screenshot:

  • Agent Detail Page

Image

  • Agent Policy

Image

  • Uninstall Command

Image

  • Request agent Diagnostics Tab

Image

harshitgupta-qasource added the bug, impact:high, and Team:Elastic-Agent-Control-Plane labels on Feb 13, 2025
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@harshitgupta-qasource
Author

@amolnater-qasource Kindly review.

@amolnater-qasource

Secondary review for this ticket is done.

@cmacknz
Member

cmacknz commented Feb 13, 2025

All of the degraded statuses are connection refused errors, for example:

              "redis/metrics-redis.keyspace-a23992cd-8211-4658-82fd-8284b6e84e01": {
                "error": "Error fetching data for metricset redis.keyspace: Failed to fetch redis info for keyspaces: dial tcp 127.0.0.1:6379: connectex: No connection could be made because the target machine actively refused it.",
                "status": "DEGRADED"
              },
              "redis/metrics-redis.info-a23992cd-8211-4658-82fd-8284b6e84e01": {
                "error": "Error fetching data for metricset redis.info: failed to fetch redis info: dial tcp 127.0.0.1:6379: connectex: No connection could be made because the target machine actively refused it.",
                "status": "DEGRADED"
              }

              "mysql/metrics-mysql.status-9c3a8403-6354-48b7-ae49-6ce070649efb": {
                "error": "Error fetching data for metricset mysql.status: dial tcp 127.0.0.1:3306: connectex: No connection could be made because the target machine actively refused it.",
                "status": "DEGRADED"
              },
              "mysql/metrics-mysql.performance-9c3a8403-6354-48b7-ae49-6ce070649efb": {
                "error": "Error fetching data for metricset mysql.performance: mysql-query fetch failed: testing connection: dial tcp 127.0.0.1:3306: connectex: No connection could be made because the target machine actively refused it.",
                "status": "DEGRADED"
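
The kind of check described above can be sketched in Python. This is a minimal sketch, not part of the agent tooling: the function names `find_degraded` and `is_connection_refused` are hypothetical, and it only assumes that the diagnostics state has been loaded as nested dicts and lists in which a unit is a mapping with `status` and `error` keys, as in the excerpt above. The exact file layout inside the diagnostics archive is not assumed.

```python
def find_degraded(node, path=""):
    """Return (location, error) pairs for every mapping whose status is DEGRADED.

    `node` is any JSON-like value (dict, list, or scalar). The location string
    is built from the keys and list indexes walked to reach the mapping.
    """
    found = []
    if isinstance(node, dict):
        if node.get("status") == "DEGRADED":
            found.append((path, node.get("error", "")))
        for key, value in node.items():
            child = f"{path} > {key}" if path else str(key)
            found.extend(find_degraded(value, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(find_degraded(value, f"{path}[{index}]"))
    return found


def is_connection_refused(error):
    """Heuristic match for the Windows and POSIX wording of a refused connection."""
    text = error.lower()
    return "actively refused" in text or "connection refused" in text
```

Passing the parsed state to `find_degraded` and filtering the results with `is_connection_refused` would show whether every degraded unit is explained by an unreachable local service, or whether something else is also failing.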

@cmacknz
Member

cmacknz commented Feb 13, 2025

It looks to me like the error "endpoint-security.exe: Operation did not complete successfully because the file contains a virus or potentially unwanted software" is the root cause.

Maybe something changed in endpoint to make Microsoft Defender dislike it. I will follow up with the endpoint team.

@bjmcnic
Contributor

bjmcnic commented Feb 13, 2025

The running endpoint is quarantining Agent's copy of the Endpoint binary that would be used to uninstall. I've been unsuccessful reproducing this so far, but I'll keep trying. Any chance you have a copy of the alert that was sent? Is this reliably reproducible in your configuration?

...
{"@timestamp":"2025-02-13T09:35:51.2447389Z","agent":{"id":"c8956ecb-5996-4116-afca-049dd6d5bfb1","type":"endpoint"},"ecs":{"version":"8.10.0"},"log":{"level":"info","origin":{"file":{"line":1286,"name":"FileScore.cpp"}}},"message":"FileScore.cpp:1286 Sending alert for [C:\\Program Files\\Elastic\\Agent\\data\\elastic-agent-9.0.0-beta1-aa8178\\components\\endpoint-security.exe]","process":{"pid":1012,"thread":{"id":7840}}}
{"@timestamp":"2025-02-13T09:35:53.883861Z","agent":{"id":"c8956ecb-5996-4116-afca-049dd6d5bfb1","type":"endpoint"},"ecs":{"version":"8.10.0"},"log":{"level":"info","origin":{"file":{"line":895,"name":"PlatformQuarantineManager.cpp"}}},"message":"PlatformQuarantineManager.cpp:895 Successfully quarantined file: C:\\Program Files\\Elastic\\Agent\\data\\elastic-agent-9.0.0-beta1-aa8178\\components\\endpoint-security.exe","process":{"pid":1012,"thread":{"id":7584}}}
...
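
A search for these events across a larger endpoint log could look like the following Python sketch. It assumes the log is newline-delimited JSON with `@timestamp` and `message` fields, as in the two lines quoted above, and that the message wording matches those lines; the function name `find_quarantine_events` is hypothetical.

```python
import json
import re

# Message patterns taken from the two log lines quoted in this thread.
ALERT_RE = re.compile(r"Sending alert for \[(?P<path>.+)\]$")
QUARANTINE_RE = re.compile(r"Successfully quarantined file: (?P<path>.+)$")


def find_quarantine_events(lines):
    """Return (timestamp, kind, path) tuples for alert and quarantine messages.

    Lines that are not JSON objects (for example the "..." elision markers)
    are skipped rather than treated as errors.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        message = record.get("message", "")
        for kind, pattern in (("alert", ALERT_RE), ("quarantined", QUARANTINE_RE)):
            match = pattern.search(message)
            if match:
                events.append((record.get("@timestamp"), kind, match.group("path")))
    return events
```

Run over the lines of an endpoint log file, this would list each file the endpoint alerted on or quarantined, which makes it easy to spot a path under the Agent's own `components` directory.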

@harshitgupta-qasource
Author

Hi @bjmcnic,

Thanks for looking into this issue.

We haven’t executed any alert files on the endpoint.

We attempted to reproduce the issue on a fresh build, but it did not reproduce.
However, the issue remains reproducible in a specific environment with a specific policy, and we encountered the same error while uninstalling the agent.

Screenshot:
Image

We have shared the Kibana environment credentials over Slack for your reference.

Kindly let us know if we missed anything.
Thanks

@bjmcnic
Contributor

bjmcnic commented Feb 14, 2025

@harshitgupta-qasource Thank you for sharing the credentials for your stack. The issue was quickly obvious once I logged in. It appears you've got a global blocklist configured that blocks anything signed by Elasticsearch, Inc. The Defend endpoint-security.exe binary is signed as such, so it's being blocked. I didn't delete that blocklist entry in your environment, but I mirrored it to my own and confirmed I produced the identical log and behavior. I suspect that if you delete that blocklist entry in your environment, you'll no longer experience this issue.
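
The reasoning here can be illustrated with a small Python sketch. It uses a deliberately simplified, hypothetical data model (`BlocklistEntry`, `blocking_entries`, and the `scope` field are illustration names, not the real Kibana blocklist schema or API) and assumes an exact match on the signer name. It only shows why a global signer-based entry also catches the Defend binary itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlocklistEntry:
    # Hypothetical, simplified model of a signer-based blocklist entry.
    name: str
    signer: str
    scope: str  # "global" or "policy" in this sketch


def blocking_entries(entries, binary_signer):
    """Return the global entries whose signer exactly matches the binary's signer."""
    wanted = binary_signer.strip()
    return [
        entry
        for entry in entries
        if entry.scope == "global" and entry.signer.strip() == wanted
    ]


entries = [
    BlocklistEntry(name="block-elastic-signed", signer="Elasticsearch, Inc.", scope="global"),
    BlocklistEntry(name="block-other-vendor", signer="Example Vendor Ltd.", scope="policy"),
]

# endpoint-security.exe is signed by Elasticsearch, Inc. (per the comment above),
# so the global entry matches it.
matches = blocking_entries(entries, "Elasticsearch, Inc.")
```

In this model, `matches` contains the single global entry, which mirrors the observed behavior: the endpoint quarantines the copy of its own binary that the Agent needs for uninstall.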

@harshitgupta-qasource
Author

Hi @bjmcnic

Sorry for the confusion.

We revalidated the issue after removing the global blocklist entry, and we were able to uninstall the agent successfully.

Build Details

VERSION: 9.0.0-beta1 BC3
BUILD: 83575
COMMIT: a9ae718019d3909912f81e5d388ef597929071a1
Artifact: https://staging.elastic.co/9.0.0-beta1-1d59e665/summary-9.0.0-beta1.html

Screenshot:

Image

Hence, we are closing this issue and marking it as QA:Validated.

Thanks

harshitgupta-qasource added the QA:Validated label on Feb 17, 2025

5 participants