[Flaky Test]: TestInstallWithEndpointSecurityAndRemoveEndpointIntegration – Error installing fleet package: Error installing endpoint 8.11.0: runtime_exception #4158

Closed
pchila opened this issue Jan 29, 2024 · 18 comments · Fixed by elastic/kibana#177380
Assignees
Labels
flaky-test Unstable or unreliable test cases. Team:Elastic-Agent Label for the Agent team

Comments

@pchila
Member

pchila commented Jan 29, 2024

Failing test case

TestInstallWithEndpointSecurityAndRemoveEndpointIntegration

Error message

version_conflict_engine_exception: [data_frame_transform_state_and_stats-endpoint.metadata_united-default-8.11.0]: version conflict, document already exists (current version [1])

Build

https://buildkite.com/elastic/elastic-agent/builds/6612

OS

Linux

Stacktrace and notes

There are multiple Endpoint Security integration failures in the same run but they seem related:

  • TestInstallWithEndpointSecurityAndRemoveEndpointIntegration/unprotected
  
   endpoint_security_test.go:370: POST /api/fleet/package_policies
    endpoint_security_test.go:370: Error installing fleet package: Error installing endpoint 8.11.0: runtime_exception
        	Caused by:
        		version_conflict_engine_exception: [data_frame_transform_state_and_stats-endpoint.metadata_united-default-8.11.0]: version conflict, document already exists (current version [1])
        	Root causes:
        		runtime_exception: Failed to persist transform statistics for transform [endpoint.metadata_united-default-8.11.0]
    endpoint_security_test.go:371: 
        	Error Trace:	/home/ubuntu/agent/testing/integration/endpoint_security_test.go:371
        	            				/home/ubuntu/agent/testing/integration/endpoint_security_test.go:143
        	Error:      	Received unexpected error:
        	            	error installing fleet package: Error installing endpoint 8.11.0: runtime_exception
        	            		Caused by:
        	            			version_conflict_engine_exception: [data_frame_transform_state_and_stats-endpoint.metadata_united-default-8.11.0]: version conflict, document already exists (current version [1])
        	            		Root causes:
        	            			runtime_exception: Failed to persist transform statistics for transform [endpoint.metadata_united-default-8.11.0]
        	Test:       	TestInstallWithEndpointSecurityAndRemoveEndpointIntegration/unprotected
        	Messages:   	Policy Response was: kibana.PackagePolicyResponse{Item:kibana.PackagePolicy{ID:"", Revision:0, Enabled:false, Inputs:[]map[string]interface {}(nil), Package:kibana.PackagePolicyRequestPackage{Name:"", Version:""}, Namespace:"", OutputID:"", PolicyID:"", Name:"", Description:""}}
  • TestInstallWithEndpointSecurityAndRemoveEndpointIntegration/protected

    endpoint_security_test.go:370: POST /api/fleet/package_policies
      endpoint_security_test.go:370: Error installing fleet package: Error installing endpoint 8.11.0: resource_not_found_exception
          	Root causes:
          		resource_not_found_exception: Transform with id [endpoint.metadata_united-default-8.11.0] could not be found
      endpoint_security_test.go:371: 
          	Error Trace:	/home/ubuntu/agent/testing/integration/endpoint_security_test.go:371
          	            				/home/ubuntu/agent/testing/integration/endpoint_security_test.go:143
          	Error:      	Received unexpected error:
          	            	error installing fleet package: Error installing endpoint 8.11.0: resource_not_found_exception
          	            		Root causes:
          	            			resource_not_found_exception: Transform with id [endpoint.metadata_united-default-8.11.0] could not be found
          	Test:       	TestInstallWithEndpointSecurityAndRemoveEndpointIntegration/protected
          	Messages:   	Policy Response was: kibana.PackagePolicyResponse{Item:kibana.PackagePolicy{ID:"", Revision:0, Enabled:false, Inputs:[]map[string]interface {}(nil), Package:kibana.PackagePolicyRequestPackage{Name:"", Version:""}, Namespace:"", OutputID:"", PolicyID:"", Name:"", Description:""}}
    
  • TestInstallAndUnenrollWithEndpointSecurity/unprotected

        endpoint_security_test.go:257: POST /api/fleet/package_policies
      endpoint_security_test.go:257: Error installing fleet package: Error installing endpoint 8.11.0: runtime_exception
          	Caused by:
          		version_conflict_engine_exception: [data_frame_transform_state_and_stats-endpoint.metadata_current-default-8.11.0]: version conflict, document already exists (current version [1])
          	Root causes:
          		runtime_exception: Failed to persist transform statistics for transform [endpoint.metadata_current-default-8.11.0]
      endpoint_security_test.go:258: 
          	Error Trace:	/home/ubuntu/agent/testing/integration/endpoint_security_test.go:258
          	            				/home/ubuntu/agent/testing/integration/endpoint_security_test.go:115
          	Error:      	Received unexpected error:
          	            	error installing fleet package: Error installing endpoint 8.11.0: runtime_exception
          	            		Caused by:
          	            			version_conflict_engine_exception: [data_frame_transform_state_and_stats-endpoint.metadata_current-default-8.11.0]: version conflict, document already exists (current version [1])
          	            		Root causes:
          	            			runtime_exception: Failed to persist transform statistics for transform [endpoint.metadata_current-default-8.11.0]
    
@pchila pchila added Team:Elastic-Agent Label for the Agent team flaky-test Unstable or unreliable test cases. labels Jan 29, 2024
@elasticmachine
Contributor

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@cmacknz
Member

cmacknz commented Jan 29, 2024

I also see:

=== RUN   TestStandaloneUpgradeRollbackOnRestarts
    upgrade_rollback_test.go:170: 
        	Error Trace:	/home/ubuntu/agent/testing/integration/upgrade_rollback_test.go:170
        	Error:      	Received unexpected error:
        	            	error retrieving versions from Artifact API: 503: bad http status code
        	Test:       	TestStandaloneUpgradeRollbackOnRestarts
--- FAIL: TestStandaloneUpgradeRollbackOnRestarts (0.05s)

I suspect that the resource not found error might be related. Perhaps this is what Fleet gives us if it can't get the package artifacts?

    endpoint_security_test.go:370: POST /api/fleet/package_policies
    endpoint_security_test.go:370: Error installing fleet package: Error installing endpoint 8.11.0: resource_not_found_exception
        	Root causes:
        		resource_not_found_exception: Transform with id [endpoint.metadata_united-default-8.11.0] could not be found
    endpoint_security_test.go:371: 
        	Error Trace:	/home/ubuntu/agent/testing/integration/endpoint_security_test.go:371
        	            				/home/ubuntu/agent/testing/integration/endpoint_security_test.go:143
        	Error:      	Received unexpected error:
        	            	error installing fleet package: Error installing endpoint 8.11.0: resource_not_found_exception
        	            		Root causes:
        	            			resource_not_found_exception: Transform with id [endpoint.metadata_united-default-8.11.0] could not be found
        	Test:       	TestInstallWithEndpointSecurityAndRemoveEndpointIntegration/protected
        	Messages:   	Policy Response was: kibana.PackagePolicyResponse{Item:kibana.PackagePolicy{ID:"", Revision:0, Enabled:false, Inputs:[]map[string]interface {}(nil), Package:kibana.PackagePolicyRequestPackage{Name:"", Version:""}, Namespace:"", OutputID:"", PolicyID:"", Name:"", Description:""}}

I suspect the version conflict errors probably follow from this incomplete package installation.

@kpollich any thoughts on these Fleet errors?

@kpollich
Member

resource_not_found_exception: Transform with id [endpoint.metadata_united-default-8.11.0] could not be found

This indicates to me that while installing endpoint we tried to fetch or start a transform that doesn't exist. The resource_not_found error is Elasticsearch's error text for a 404.

This could even be triggered by the cleanup logic, where we delete previous-version transforms during an install/upgrade of a package that includes transform assets:

https://github.com/elastic/kibana/blob/ace2b78cd8edcdda10f803019716cc59d74bd58b/x-pack/plugins/fleet/server/services/epm/elasticsearch/transform/install.ts#L460-L476

My suspicion here is that the asset reference to this transform is still present even though the transform was never created in this case. So, when we try to "clean up" that existing transform, we get this 404.

version_conflict_engine_exception: [data_frame_transform_state_and_stats-endpoint.metadata_united-default-8.11.0]: version conflict, document already exists (current version [1])

I am not familiar enough with Elasticsearch transforms to know what this might be referring to, so I'm going to ping @qn895 who implemented almost all of Fleet's transform management logic. Maybe her expertise here can point us in the right direction 🤞

@qn895 qn895 self-assigned this Jan 30, 2024
@qn895
Member

qn895 commented Jan 30, 2024

Currently investigating this... @pchila, is this a problem only with your PR or across multiple other PRs as well?

@cmacknz
Member

cmacknz commented Jan 30, 2024

This happens inconsistently across different branches. Nothing in this repository changes Fleet or Kibana at all; all we are doing is installing and uninstalling packages using the package policies API.

We may have created an unintentional stress test by installing and uninstalling the Elastic Defend integration rapidly, where rapidly is something like once per minute over 5 minutes. So faster than most users would ever do, but not unreasonably quickly.

@pchila
Member Author

pchila commented Jan 30, 2024

@qn895 I only saw that error in that specific run of the PR CI build. It came up after another run where the artifact API misbehaved a lot with 503 HTTP status codes.
Following our new procedure for CI failures, I filed an issue for it, but as @cmacknz was saying, it may be caused by external factors.

I haven't seen it happen since, and the next run for the same code was green 🤷‍♂️

@qn895
Member

qn895 commented Jan 30, 2024

@cmacknz @pchila Thanks all for the context. I'll try to reproduce it, and will reach out for questions.

@przemekwitek

I am not familiar enough with Elasticsearch transforms to know what this might be referring to

This error is coming from the transform code:
https://github.com/elastic/elasticsearch/blob/42f3a8f6ca57e52d2856c2d3b7c59cbce9ac3689/x-pack/plugin/transform/src/main/java/org/elasticsearch/xpack/transform/persistence/IndexBasedTransformConfigManager.java#L716
It happens when there is a race condition and two threads try to index the new version of the transform stats document at the same time.
Maybe we could handle these conflicts better in the backend. But first I need to understand exactly which scenario is being executed.
Could you point me to the test scenario code? Is it in the endpoint_security_test.go file?
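To make the race described above concrete, here is a minimal Go sketch of two writers trying to create the same document ID. It is only an illustration: `docStore`, `Create`, and `raceCreate` are hypothetical names, and the in-memory map stands in for an index. It is not the Elasticsearch transform persistence code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrVersionConflict mimics the wording of version_conflict_engine_exception.
var ErrVersionConflict = errors.New("version conflict, document already exists")

// docStore is a hypothetical in-memory stand-in for an index.
type docStore struct {
	mu   sync.Mutex
	docs map[string]int // document ID -> current version
}

func newDocStore() *docStore {
	return &docStore{docs: make(map[string]int)}
}

// Create stores a document only if its ID is not taken yet (create-only semantics).
func (s *docStore) Create(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if v, ok := s.docs[id]; ok {
		return fmt.Errorf("[%s]: %w (current version [%d])", id, ErrVersionConflict, v)
	}
	s.docs[id] = 1
	return nil
}

// raceCreate starts n concurrent writers for the same ID and reports
// how many created the document and how many hit a version conflict.
func raceCreate(s *docStore, id string, n int) (created, conflicts int) {
	var wg sync.WaitGroup
	results := make(chan error, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- s.Create(id)
		}()
	}
	wg.Wait()
	close(results)
	for err := range results {
		switch {
		case err == nil:
			created++
		case errors.Is(err, ErrVersionConflict):
			conflicts++
		}
	}
	return created, conflicts
}
```

With two writers, whichever goroutine takes the lock first creates the document and the other gets the conflict, which matches the shape of the error in the logs above.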

@cmacknz
Member

cmacknz commented Feb 1, 2024

Yes, this is the function we use to install the Elastic Defend package:

func installElasticDefendPackage(t *testing.T, info *define.Info, policyID string) (r kibana.PackagePolicyResponse, err error) {

This is the package template we render and POST to /api/fleet/package_policies https://github.com/elastic/elastic-agent/blob/main/testing/integration/endpoint_security_package.json.tmpl

We have several tests that enroll an agent into a new agent policy within a short time. We likely end up with Elastic Defend installed 10-20 times on the same Fleet instance with each variant of the integration in its own agent policy.

@cmacknz
Member

cmacknz commented Feb 8, 2024

@qn895 not sure if there's been any progress here, but we uncovered a similar bug in Fleet related to concurrently installing the same package.

#4102 (comment)

I have a suspicion the root cause is probably similar here, since concurrently installing the same package many times in a short duration seems to be the pattern in our tests leading to problems on the Kibana side.

@qn895
Member

qn895 commented Feb 8, 2024

Thanks @cmacknz for the context. I'm planning to investigate this more thoroughly tomorrow and over the upcoming week. I will also sync up with @przemekwitek on the best approach to fix this issue moving forward.

@rdner
Member

rdner commented Feb 19, 2024

@qn895 any update on the investigation?

@qn895
Member

qn895 commented Feb 20, 2024

Hi @rdner. Sorry, I was out yesterday. This was rather hard for me to reproduce, even using the steps in #4102 (comment). My methodology was to use the API to create integration policies and install the packages in rapid succession, and it looks to me like there are similar errors outside of Transforms. For example, I got:

[ERROR][plugins.securitySolution] ResponseError: [endpoint:user-artifact-manifest:endpoint-manifest-v1]: version conflict, document already exists (current version [1]): version_conflict_engine_exception
        Root causes:
                version_conflict_engine_exception: [endpoint:user-artifact-manifest:endpoint-manifest-v1]: version conflict, document already exists (current version [1])
    at KibanaTransport.request (/Users/quynh/Documents/projects/kibana/node_modules/@elastic/transport/src/Transport.ts:535:17)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)

Which, to me, raises the question of whether the unintentional stress testing is also affecting other types of assets outside of Transforms. For Transforms specifically, I see two categories of errors:

  1. version_conflict_engine_exception: [data_frame_transform_state_and_stats-endpoint.metadata_current-default-8.11.0]: version conflict, document already exists (current version [1])

This will require @przemekwitek to investigate a bit further on the Elasticsearch side to see if there's anything we can do to handle this better. We can also choose to ignore this error on the Kibana Fleet installation side, and I can create a PR for that.

  2. resource_not_found_exception: Transform with id [endpoint.metadata_united-default-8.11.0] could not be found

This happens during the package install phase and should be non-blocking. I can create a PR to better handle this scenario.

So to recap: for the short term, I will create a Kibana PR so that the errors we are seeing no longer bubble up and block the uninstall/install process. That should hopefully fix some of the errors we are seeing. However, I think the process might highlight issues that affect asset types other than Transforms. In the longer term, I would also like to know the best practice for handling assets in Fleet in scenarios where uninstall/install is rapidly triggered.

@rdner
Member

rdner commented Feb 20, 2024

For the short term, I will create a Kibana PR that would no longer bubble up the errors we are seeing to no longer block the uninstall/install process.

@qn895 thank you! please link your PR to this issue.

In the longer term, would also like to know what's the best practice for handling assets in Fleet in scenarios where uninstall/install is rapidly triggered.

To me, this sounds similar to another issue we had with package installation. Here is a short summary: #4102 (comment)

Perhaps looking into the fix for that issue can help.

@qn895
Member

qn895 commented Feb 20, 2024

@rdner @cmacknz Created a PR for it here: elastic/kibana#177380. Hopefully that resolves this flaky test issue. I'd appreciate any extra testing the team can do with this PR, as I had a hard time replicating the same error even with API calls in quick sequence or concurrently.

So to replicate this, you need API calls to:
1. Create a new empty agent policy with no integrations.
2. Create a new installation of Elastic Defend by making a call to the package policy API for this newly created policy.

@rdner
Member

rdner commented Mar 7, 2024

Failed again in https://buildkite.com/elastic/elastic-agent/builds/7632#018e1a5c-a65d-4d5a-9d4d-1feeca31c961

@qn895 I see that the PR is still in draft after 2 weeks, any updates?

@qn895
Member

qn895 commented Mar 12, 2024

Sorry I was out the last two weeks. I'll open the PR up for review tomorrow, but would appreciate some additional testing before updating all the Kibana unit tests. The main goal with this PR is to isolate whether this is something that can be fixed on Kibana's end, or Elasticsearch's end.

@rdner
Member

rdner commented Mar 15, 2024

qn895 added a commit to elastic/kibana that referenced this issue Mar 15, 2024
…block install or uninstall of package (#177380)

## Summary

This PR fixes so that certain exceptions from Transforms will no longer
cause error when installing or uninstalling a package.

```
Error installing fleet package: Error installing endpoint 8.11.0: resource_not_found_exception
      	Root causes:
      		resource_not_found_exception: Transform with id [endpoint.metadata_united-default-8.11.0] could not be found
  endpoint_security_test.go:371: 
 ```

```
Error installing fleet package: Error installing endpoint 8.11.0:
runtime_exception
      	Caused by:
version_conflict_engine_exception:
[data_frame_transform_state_and_stats-endpoint.metadata_current-default-8.11.0]:
version conflict, document already exists (current version [1])
      	Root causes:
runtime_exception: Failed to persist transform statistics for transform
[endpoint.metadata_current-default-8.11.0]
```

These errors might happen when the policies and packages are installed or uninstalled in rapid succession.

 
Closes elastic/elastic-agent#4158

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Mar 15, 2024
…block install or uninstall of package (elastic#177380)

(cherry picked from commit c65faf3)
kibanamachine referenced this issue in elastic/kibana Mar 15, 2024
…rror to not block install or uninstall of package (#177380) (#178823)

# Backport

This will backport the following commits from `main` to `8.13`:
- [[Fleet] Fixes Transform's version conflict or not found error to
not block install or uninstall of package
(#177380)](#177380)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Quynh Nguyen
(Quinn)","email":"43350163+qn895@users.noreply.github.com"},"sourceCommit":{"committedDate":"2024-03-15T18:16:05Z","message":"[Fleet]
Fixes Transform's version conflict or not found error to not block
install or uninstall of package (#177380)\n\n## Summary\r\n\r\nThis PR
fixes so that certain exceptions from Transforms will no longer\r\ncause
error when installing or uninstalling a package.\r\n\r\n```\r\nError
installing fleet package: Error installing endpoint 8.11.0:
resource_not_found_exception\r\n \tRoot causes:\r\n
\t\tresource_not_found_exception: Transform with id
[endpoint.metadata_united-default-8.11.0] could not be found\r\n
endpoint_security_test.go:371: \r\n ```\r\n\r\n```\r\nError installing
fleet package: Error installing endpoint
8.11.0:\r\nruntime_exception\r\n \tCaused
by:\r\nversion_conflict_engine_exception:\r\n[data_frame_transform_state_and_stats-endpoint.metadata_current-default-8.11.0]:\r\nversion
conflict, document already exists (current version [1])\r\n \tRoot
causes:\r\nruntime_exception: Failed to persist transform statistics for
transform\r\n[endpoint.metadata_current-default-8.11.0]\r\n```\r\n\r\nThese
errors might happen when the policies and packages are installed or
uninstalled in rapid succession.\r\n\r\n \r\n### Checklist\r\n\r\nDelete
any items that are not applicable to this PR.\r\n\r\n- [ ] Any text
added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)\r\n-
[ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials\r\n- [ ]
[Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios\r\n- [ ] [Flaky
Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was used on any tests changed\r\n- [ ] Any UI touched in this PR is
usable by keyboard only (learn more about [keyboard
accessibility](https://webaim.org/techniques/keyboard/))\r\n- [ ] Any UI
touched in this PR does not create any new axe failures (run axe in
browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))\r\n-
[ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)\r\n-
[ ] This renders correctly on smaller devices using a responsive layout.
(You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))\r\n-
[ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)\r\n\r\n\r\n###
Risk Matrix\r\n\r\nDelete this section if it is not applicable to this
PR.\r\n\r\nBefore closing this PR, invite QA, stakeholders, and other
developers to identify risks that should be tested prior to the
change/feature release.\r\n\r\nWhen forming the risk matrix, consider
some of the following examples and how they may potentially impact the
change:\r\n\r\n| Risk | Probability | Severity | Mitigation/Notes
|\r\n|---------------------------|-------------|----------|-------------------------|\r\n|
Multiple Spaces&mdash;unexpected behavior in non-default Kibana Space. |
Low | High | Integration tests will verify that all features are still
supported in non-default Kibana Space and when user switches between
spaces. |\r\n| Multiple nodes&mdash;Elasticsearch polling might have
race conditions when multiple Kibana nodes are polling for the same
tasks. | High | Low | Tasks are idempotent, so executing them multiple
times will not result in logical error, but will degrade performance. To
test for this case we add plenty of unit tests around this logic and
document manual testing procedure. |\r\n| Code should gracefully handle
cases when feature X or plugin Y are disabled. | Medium | High | Unit
tests will verify that any feature flag or plugin combination still
results in our service operational. |\r\n| [See more potential risk
examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx)
|\r\n\r\n\r\n### For maintainers\r\n\r\n- [ ] This was checked for
breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)\r\n\r\nCloses
https://github.com/elastic/elastic-agent/issues/4158\r\n\r\n---------\r\n\r\nCo-authored-by:
Kibana Machine
<42973632+kibanamachine@users.noreply.github.com>","sha":"c65faf395ebf8a6a05f2bfda779e28306b1f2707","branchLabelMapping":{"^v8.14.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":[":ml","release_note:skip","Team:Fleet","v8.13.0","v8.14.0","v8.13.1"],"title":"[Fleet]
Fixes Transform's version conflict or not found error to not block
install or uninstall of
package","number":177380,"url":"https://github.com/elastic/kibana/pull/177380","mergeCommit":{"message":"[Fleet]
Fixes Transform's version conflict or not found error to not block
install or uninstall of package (#177380)\n\n## Summary\r\n\r\nThis PR
fixes so that certain exceptions from Transforms will no longer\r\ncause
error when installing or uninstalling a package.\r\n\r\n```\r\nError
installing fleet package: Error installing endpoint 8.11.0:
resource_not_found_exception\r\n \tRoot causes:\r\n
\t\tresource_not_found_exception: Transform with id
[endpoint.metadata_united-default-8.11.0] could not be found\r\n
endpoint_security_test.go:371: \r\n ```\r\n\r\n```\r\nError installing
fleet package: Error installing endpoint
8.11.0:\r\nruntime_exception\r\n \tCaused
by:\r\nversion_conflict_engine_exception:\r\n[data_frame_transform_state_and_stats-endpoint.metadata_current-default-8.11.0]:\r\nversion
conflict, document already exists (current version [1])\r\n \tRoot
causes:\r\nruntime_exception: Failed to persist transform statistics for
transform\r\n[endpoint.metadata_current-default-8.11.0]\r\n```\r\n\r\nThese
errors might happen when the policies and packages are installed or
uninstalled in rapid succession.\r\n\r\n \r\n### Checklist\r\n\r\nDelete
any items that are not applicable to this PR.\r\n\r\n- [ ] Any text
added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)\r\n-
[ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials\r\n- [ ]
[Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios\r\n- [ ] [Flaky
Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was used on any tests changed\r\n- [ ] Any UI touched in this PR is
usable by keyboard only (learn more about [keyboard
accessibility](https://webaim.org/techniques/keyboard/))\r\n- [ ] Any UI
touched in this PR does not create any new axe failures (run axe in
browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))\r\n-
[ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)\r\n-
[ ] This renders correctly on smaller devices using a responsive layout.
(You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))\r\n-
[ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)\r\n\r\n\r\n###
Risk Matrix\r\n\r\nDelete this section if it is not applicable to this
PR.\r\n\r\nBefore closing this PR, invite QA, stakeholders, and other
developers to identify risks that should be tested prior to the
change/feature release.\r\n\r\nWhen forming the risk matrix, consider
some of the following examples and how they may potentially impact the
change:\r\n\r\n| Risk | Probability | Severity | Mitigation/Notes
|\r\n|---------------------------|-------------|----------|-------------------------|\r\n|
Multiple Spaces&mdash;unexpected behavior in non-default Kibana Space. |
Low | High | Integration tests will verify that all features are still
supported in non-default Kibana Space and when user switches between
spaces. |\r\n| Multiple nodes&mdash;Elasticsearch polling might have
race conditions when multiple Kibana nodes are polling for the same
tasks. | High | Low | Tasks are idempotent, so executing them multiple
times will not result in logical error, but will degrade performance. To
test for this case we add plenty of unit tests around this logic and
document manual testing procedure. |\r\n| Code should gracefully handle
cases when feature X or plugin Y are disabled. | Medium | High | Unit
tests will verify that any feature flag or plugin combination still
results in our service operational. |\r\n| [See more potential risk
examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx)
|\r\n\r\n\r\n### For maintainers\r\n\r\n- [ ] This was checked for
breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)\r\n\r\nCloses
https://github.com/elastic/elastic-agent/issues/4158\r\n\r\n---------\r\n\r\nCo-authored-by:
Kibana Machine
<42973632+kibanamachine@users.noreply.github.com>","sha":"c65faf395ebf8a6a05f2bfda779e28306b1f2707"}},"sourceBranch":"main","suggestedTargetBranches":["8.13"],"targetPullRequestStates":[{"branch":"8.13","label":"v8.13.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.14.0","branchLabelMappingKey":"^v8.14.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/177380","number":177380,"mergeCommit":{"message":"[Fleet]
Fixes Transform's version conflict or not found error to not block
install or uninstall of package (#177380)\n\n## Summary\r\n\r\nThis PR
fixes so that certain exceptions from Transforms will no longer\r\ncause
error when installing or uninstalling a package.\r\n\r\n```\r\nError
installing fleet package: Error installing endpoint 8.11.0:
resource_not_found_exception\r\n \tRoot causes:\r\n
\t\tresource_not_found_exception: Transform with id
[endpoint.metadata_united-default-8.11.0] could not be found\r\n
endpoint_security_test.go:371: \r\n ```\r\n\r\n```\r\nError installing
fleet package: Error installing endpoint
8.11.0:\r\nruntime_exception\r\n \tCaused
by:\r\nversion_conflict_engine_exception:\r\n[data_frame_transform_state_and_stats-endpoint.metadata_current-default-8.11.0]:\r\nversion
conflict, document already exists (current version [1])\r\n \tRoot
causes:\r\nruntime_exception: Failed to persist transform statistics for
transform\r\n[endpoint.metadata_current-default-8.11.0]\r\n```\r\n\r\nThese
errors might happen when the policies and packages are installed or
uninstalled in rapid succession.\r\n\r\n \r\n### Checklist\r\n\r\nDelete
any items that are not applicable to this PR.\r\n\r\n- [ ] Any text
added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)\r\n-
[ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials\r\n- [ ]
[Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios\r\n- [ ] [Flaky
Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was used on any tests changed\r\n- [ ] Any UI touched in this PR is
usable by keyboard only (learn more about [keyboard
accessibility](https://webaim.org/techniques/keyboard/))\r\n- [ ] Any UI
touched in this PR does not create any new axe failures (run axe in
browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))\r\n-
[ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)\r\n-
[ ] This renders correctly on smaller devices using a responsive layout.
(You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))\r\n-
[ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)\r\n\r\n\r\n###
Risk Matrix\r\n\r\nDelete this section if it is not applicable to this
PR.\r\n\r\nBefore closing this PR, invite QA, stakeholders, and other
developers to identify risks that should be tested prior to the
change/feature release.\r\n\r\nWhen forming the risk matrix, consider
some of the following examples and how they may potentially impact the
change:\r\n\r\n| Risk | Probability | Severity | Mitigation/Notes
|\r\n|---------------------------|-------------|----------|-------------------------|\r\n|
Multiple Spaces&mdash;unexpected behavior in non-default Kibana Space. |
Low | High | Integration tests will verify that all features are still
supported in non-default Kibana Space and when user switches between
spaces. |\r\n| Multiple nodes&mdash;Elasticsearch polling might have
race conditions when multiple Kibana nodes are polling for the same
tasks. | High | Low | Tasks are idempotent, so executing them multiple
times will not result in logical error, but will degrade performance. To
test for this case we add plenty of unit tests around this logic and
document manual testing procedure. |\r\n| Code should gracefully handle
cases when feature X or plugin Y are disabled. | Medium | High | Unit
tests will verify that any feature flag or plugin combination still
results in our service operational. |\r\n| [See more potential risk
examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx)
|\r\n\r\n\r\n### For maintainers\r\n\r\n- [ ] This was checked for
breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)\r\n\r\nCloses
https://github.com/elastic/elastic-agent/issues/4158\r\n\r\n---------\r\n\r\nCo-authored-by:
Kibana Machine
<42973632+kibanamachine@users.noreply.github.com>","sha":"c65faf395ebf8a6a05f2bfda779e28306b1f2707"}}]}]
BACKPORT-->

Co-authored-by: Quynh Nguyen (Quinn) <43350163+qn895@users.noreply.github.com>
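
The two error shapes quoted in the PR summary (a `resource_not_found_exception`, and a `runtime_exception` caused by a `version_conflict_engine_exception`) can be told apart from other failures by inspecting the error types. The following is a minimal sketch of such a check, not the code from elastic/kibana#177380: the `EsErrorBody` interface, the `TOLERATED_TYPES` list, and `isTolerableTransformError` are hypothetical names introduced for illustration, and the assumption is that the caller has already extracted the Elasticsearch error body from the failed request.

```typescript
// Hypothetical, simplified model of an Elasticsearch error body.
// Only the fields used by this sketch are included.
interface EsErrorCause {
  type: string;
  reason?: string;
}

interface EsErrorBody {
  type: string;
  reason?: string;
  caused_by?: EsErrorCause;
  root_cause?: EsErrorCause[];
}

// Error types named in the PR summary. Treating exactly these two as
// non-blocking is an assumption of this sketch.
const TOLERATED_TYPES: readonly string[] = [
  'version_conflict_engine_exception',
  'resource_not_found_exception',
];

// Returns true when the error itself, its `caused_by`, or any of its
// root causes has one of the tolerated types.
function isTolerableTransformError(error: EsErrorBody): boolean {
  const types: string[] = [error.type];
  if (error.caused_by) {
    types.push(error.caused_by.type);
  }
  for (const cause of error.root_cause ?? []) {
    types.push(cause.type);
  }
  return types.some((t) => TOLERATED_TYPES.indexOf(t) !== -1);
}

// Shapes modelled on the two stack traces quoted in this issue.
const versionConflict: EsErrorBody = {
  type: 'runtime_exception',
  caused_by: { type: 'version_conflict_engine_exception' },
  root_cause: [{ type: 'runtime_exception' }],
};

const transformNotFound: EsErrorBody = {
  type: 'resource_not_found_exception',
  root_cause: [{ type: 'resource_not_found_exception' }],
};

// An unrelated failure that should still block the install.
const unrelated: EsErrorBody = {
  type: 'security_exception',
  root_cause: [{ type: 'security_exception' }],
};

console.log(isTolerableTransformError(versionConflict)); // true
console.log(isTolerableTransformError(transformNotFound)); // true
console.log(isTolerableTransformError(unrelated)); // false
```

A caller could log tolerated errors and continue with the install or uninstall, while rethrowing everything else. Checking `caused_by` matters here because, in the version-conflict trace, both the top-level type and the root cause are `runtime_exception`.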