[Inconsistent]: Upgraded Fleet Server 8.18.0 BC3>9.0.0 BC3 goes offline permanently on upgraded self-managed kibana. #4474

Open
amolnater-qasource opened this issue Feb 14, 2025 · 5 comments
Labels

  • bug (Something isn't working)
  • impact:high (Short-term priority; add to current release, or definitely next.)
  • Team:Elastic-Agent-Control-Plane (Label for the Agent Control Plane team)
  • Team:Fleet (Label for the Fleet team)

Comments

@amolnater-qasource
Collaborator

amolnater-qasource commented Feb 14, 2025

Kibana Build details:

VERSION: 9.0.0-beta1 BC3
BUILD: 83575
COMMIT: a9ae718019d3909912f81e5d388ef597929071a1

Artifact: https://staging.elastic.co/8.18.0-16a1508a/downloads/beats/elastic-agent/elastic-agent-8.18.0-windows-x86_64.zip

Preconditions:

  1. A self-managed Kibana environment upgraded from 8.18.0 BC3 to 9.0.0-beta1 BC3 should be available.
  2. An 8.18.0 BC3 Fleet Server should have been installed while the environment was still on 8.18.0 BC3.

Steps to reproduce:

  1. Upgrade the self-managed Kibana from 8.18.0 to 9.0.0-beta1.
  2. Upgrade the 8.18.0 Fleet Server to the latest 9.0.0-beta1.
  3. Observe that the Fleet Server upgrades successfully and then goes permanently offline.

Expected Result:
A Fleet Server upgraded from 8.18.0 BC3 to 9.0.0 BC3 should remain Healthy on the upgraded self-managed Kibana.

NOTE:

  • When we reinstalled the 8.18.0 Fleet Server on the upgraded self-managed 9.0.0 environment, it upgraded successfully.
  • The issue is only observed when the Fleet Server was installed on the 8.18.0 BC3 build.
  • [Update]: The issue is not consistently reproducible.

Logs:
elastic-agent-diagnostics-2025-02-14T10-02-11Z-00.zip

Agent json:
EC2AMAZ-ED5G7SB-agent-details.zip

Screenshot:

[Three screenshots attached to the original issue]

@amolnater-qasource added the bug, impact:high, Team:Elastic-Agent-Control-Plane, and Team:Fleet labels on Feb 14, 2025
@amolnater-qasource
Collaborator Author

@muskangulati-qasource Please review.

@muskangulati-qasource

Secondary review is done for this ticket!

@amolnater-qasource changed the title from "Upgraded Fleet Server 8.18.0 BC3>9.0.0 BC3 goes offline permanently on upgraded self-managed kibana." to "[Inconsistent]: Upgraded Fleet Server 8.18.0 BC3>9.0.0 BC3 goes offline permanently on upgraded self-managed kibana." on Feb 14, 2025
@cmacknz
Member

cmacknz commented Feb 14, 2025

            input-fleet-server-default-fleet-server-fleet_server-691879e0-d583-4022-bba5-61d8bb8dbd80:
                message: 'Error - could not start the HTTP server for the API: failed to listen on the named pipe \\.\pipe\UwGGXFL1il700DVAc6q-T-1Z9J1UjGMU.sock: open \\.\pipe\UwGGXFL1il700DVAc6q-T-1Z9J1UjGMU.sock: Access is denied.'
                state: 4
            output-fleet-server-default:
                message: 'Error - could not start the HTTP server for the API: failed to listen on the named pipe \\.\pipe\UwGGXFL1il700DVAc6q-T-1Z9J1UjGMU.sock: open \\.\pipe\UwGGXFL1il700DVAc6q-T-1Z9J1UjGMU.sock: Access is denied.'
                state: 4

Fleet Server not being able to create the named pipe it needs is for sure the problem here, but I'm not sure why.
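
For triage across several diagnostics bundles, the check above can be sketched as a small script. This is only an illustration: it scans the raw component-state text with a regular expression rather than a YAML parser, the helper name `find_pipe_errors` is hypothetical, and the sample text is copied from the excerpt above.

```python
import re

# Matches the named-pipe listen failure quoted in the component state above.
PIPE_ERROR = re.compile(
    r"failed to listen on the named pipe (?P<pipe>\S+?): .*Access is denied"
)

# Sample copied from the excerpt above; a raw string keeps the backslashes literal.
SAMPLE = r"""
input-fleet-server-default-fleet-server-fleet_server-691879e0-d583-4022-bba5-61d8bb8dbd80:
    message: 'Error - could not start the HTTP server for the API: failed to listen on the named pipe \\.\pipe\UwGGXFL1il700DVAc6q-T-1Z9J1UjGMU.sock: open \\.\pipe\UwGGXFL1il700DVAc6q-T-1Z9J1UjGMU.sock: Access is denied.'
    state: 4
output-fleet-server-default:
    message: 'Error - could not start the HTTP server for the API: failed to listen on the named pipe \\.\pipe\UwGGXFL1il700DVAc6q-T-1Z9J1UjGMU.sock: open \\.\pipe\UwGGXFL1il700DVAc6q-T-1Z9J1UjGMU.sock: Access is denied.'
    state: 4
"""


def find_pipe_errors(state_text: str) -> dict[str, str]:
    """Map each unit name to the pipe path it failed to listen on."""
    failures: dict[str, str] = {}
    unit = None
    for raw in state_text.splitlines():
        line = raw.strip()
        # A line that is only "name:" starts a new unit entry.
        if line.endswith(":"):
            unit = line[:-1]
            continue
        match = PIPE_ERROR.search(line)
        if match and unit is not None:
            failures[unit] = match.group("pipe")
    return failures


if __name__ == "__main__":
    for unit_name, pipe in find_pipe_errors(SAMPLE).items():
        print(f"{unit_name}: {pipe}")
```

Both units here report the same pipe path, which is consistent with a single listener failing rather than two independent problems.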

@cmacknz
Member

cmacknz commented Feb 14, 2025

@leehinman any ideas what would cause Fleet Server to inconsistently get an "Access is denied" error when trying to create the named pipe for its metrics server?

It's using the elastic-agent-libs makeListener, not agent's SocketURLWithFallback.

@cmacknz
Member

cmacknz commented Feb 14, 2025

That this happens on upgrade makes me wonder if the previous fleet-server instance still has a reference to the pipe or something.
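
For context, Windows documents that CreateNamedPipe called with FILE_FLAG_FIRST_PIPE_INSTANCE fails with ERROR_ACCESS_DENIED when an instance of the pipe already exists; whether that is the code path hit here is not confirmed in this thread. The general shape of the hypothesis (a second listener cannot claim an endpoint until the first holder releases it) can be sketched portably. The example below is only an analogy: it uses a TCP port instead of a Windows named pipe, the error it triggers is "address in use" rather than "Access is denied", and the function name is hypothetical.

```python
import socket


def demo_exclusive_listener() -> tuple[bool, bool]:
    """Return (blocked_while_held, recovered_after_release)."""
    # The "old" process: holds the endpoint.
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    first.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    first.listen()
    addr = first.getsockname()

    # The "new" process: tries to claim the same endpoint while it is held.
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        second.bind(addr)
        blocked = False
    except OSError:
        blocked = True
    finally:
        second.close()

    # The old holder goes away, then the new process tries again.
    first.close()
    retry = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        retry.bind(addr)
        retry.listen()
        recovered = True
    except OSError:
        recovered = False
    finally:
        retry.close()

    return blocked, recovered


if __name__ == "__main__":
    print(demo_exclusive_listener())
```

If the old fleet-server instance were still holding the pipe during the upgrade, the timing of its exit relative to the new instance's start would also fit the inconsistent reproduction.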
