Synthetic monitors containing variable placeholders deployed on Elastic Agent never report #4130

Open
lucabelluccini opened this issue Jan 24, 2024 · 1 comment
Labels
bug Something isn't working

Comments

Contributor

lucabelluccini commented Jan 24, 2024

Reproducible on 8.12, 8.11 and 8.10 at least.

  1. Set up a Fleet-managed Elastic Agent to act as a Private Location, as described here (this requires running Elastic Agent Complete on Docker)
  2. Create a Synthetics browser monitor containing valid code that uses a variable in a JS template string:
    step('go', async() => { const something = 'synthetics-demo'; await page.goto(`https://elastic.github.io/${something}/`); });
    
  3. Deploy the Synthetic monitor to the Private Location

Tested on 8.12 (8.11 and 8.10 behave the same way)

Add a browser monitor which uses JS template strings with variable placeholders ${...}

  • The monitor will never run 🔴
  • The Synthetic UI will report the monitor is pending first run 🔴
  • Elastic Agent stays healthy 🟡, but the status reports no error even though one unit is completely ignored
  • The policy with the synthetic monitor is received by Elastic Agent (it is present in pre-config.yml), but the synthetics/browser-default component is never started 🔴
  • Logs do not report ANY error about this, even at debug level 🔴
  • The policy pre-config.yml contains the monitor 🟢:
          name: Test
          origin: ui
          processors:
            - add_fields:
                fields:
                    config_id: 974d3e96-f33b-4cd7-8e4d-0ad7fce4b698
                    meta:
                        space_id: default
                    monitor.fleet_managed: true
                    monitor.project.id: ""
                    monitor.project.name: ""
                target: ""
          run_from.geo.name: Test Private Location
          run_from.id: 2d861f80-3d1b-11ee-9bb6-6dcabad9705b
          schedule: '@every 1m'
          screenshots: "on"
          source.inline.script: |-
            step('go', async() => {
              const something = 'synthetics-demo';
                await page.goto(`https://elastic.github.io/${something}/`);
            });
          throttling:
            download: 5
            latency: 20
            upload: 3
          timeout: null
          type: browser
    

This might be related to elastic/kibana#169963, but the point here is that Heartbeat is not even started as a component, and no errors are exposed in diagnostics or logs.
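To check which monitors are exposed to this behavior, it can help to scan inline scripts for `${...}` sequences before deploying them. The sketch below is only an illustration: `findPlaceholders` is a hypothetical helper, not part of Synthetics or Elastic Agent. It assumes that any `${` in the policy text may be read by Elastic Agent as the start of a policy variable, which is what the behavior in this issue suggests.

```javascript
// Hypothetical helper: list every `${...}` sequence found in an inline script.
// Assumption: any `${` in the policy text may be treated by Elastic Agent as a
// policy variable, so these are the spots worth reviewing before deployment.
function findPlaceholders(script) {
  const found = [];
  let start = script.indexOf('${');
  while (start !== -1) {
    const end = script.indexOf('}', start + 2);
    found.push({
      index: start,
      // An opening `${` with no later `}` is reported up to the end of the script.
      text: end === -1 ? script.slice(start) : script.slice(start, end + 1),
      terminated: end !== -1,
    });
    start = script.indexOf('${', start + 2);
  }
  return found;
}

// The inline script from the monitor above, as plain text.
const script = [
  "step('go', async() => {",
  "  const something = 'synthetics-demo';",
  "  await page.goto(`https://elastic.github.io/${something}/`);",
  "});",
].join('\n');

console.log(findPlaceholders(script).map((p) => p.text));
```

For the monitor in this report, the scan flags the single `${something}` placeholder.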

Add an HTTP Monitor

  • Adding an HTTP monitor to the same Private Location triggers the spawn of Heartbeat, and the HTTP monitor runs fine. 🟢

Add another Browser monitor which doesn't make use of ${...}

  • Adding a second browser monitor without any ${...} makes the synthetics/browser-default component spawn, but its generated configuration only contains the browser monitor without the ${...}. 🟢
          run_from:
            geo:
                name: Test Private Location
            id: 2d861f80-3d1b-11ee-9bb6-6dcabad9705b
          schedule: '@every 1m'
          screenshots: "on"
          source:
            inline:
                script: |-
                    step('this works', async() => {
                      const something = 'synthetics-demo';
                        await page.goto(`https://elastic.github.io/synthetics-demo/`);
                    });
          throttling:
            download: 5
            latency: 20
            upload: 3
          timeout: null
          type: browser
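
Since the monitor without `${...}` is picked up, one possible workaround is to build the URL with string concatenation instead of a template literal. This is an assumption based only on the observation above and was not verified in this issue. `demoUrl` is a hypothetical helper used to keep the sketch runnable outside the Synthetics runtime.

```javascript
// Hypothetical helper mirroring the monitor's URL construction without a
// template literal, so the script text contains no `${` sequence.
function demoUrl(something) {
  return 'https://elastic.github.io/' + something + '/';
}

// Inside the monitor's inline script the same idea would read as follows
// (left as a comment because `step` and `page` are provided by Synthetics):
//
//   step('go', async () => {
//     const something = 'synthetics-demo';
//     await page.goto('https://elastic.github.io/' + something + '/');
//   });

console.log(demoUrl('synthetics-demo'));
```

This only sidesteps the symptom; it does not address why the monitor is dropped silently.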
    

Add another Browser monitor with formatting issues

  • Adding another browser monitor containing the payload below (pasted literally into the Synthetics monitor editor in Kibana):
    step('Go somewhere',
      async () => {\r\n  await page.goto('https://google.com');\r\n});\r\n\r\n//
      Log all uncaught errors to the terminal\r\ncontext.on('weberror', webError =>
      {\r\n  console.log(`Uncaught exception: \"${webError.error()}\"`);\r\n});
    
    • Makes the Elastic Agent unhealthy, with Last checkin message: Invalid component model: rendering inputs failed: starting ${ is missing ending } 🟡
    • The logs are then flooded with:
      {"log.level":"error","@timestamp":"2024-01-24T18:29:01.312Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":894},"message":"applying new policy: generating component model: rendering inputs failed: starting ${ is missing ending }","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"}
      {"log.level":"info","@timestamp":"2024-01-24T18:29:03.969Z","log.origin":{"file.name":"upgrade/upgrade.go","file.line":111},"message":"Source URI changed from \"https://artifacts.elastic.co/downloads/\" to \"https://artifacts.elastic.co/downloads/\"","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"}
      {"log.level":"error","@timestamp":"2024-01-24T18:29:03.970Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":894},"message":"applying new policy: generating component model: rendering inputs failed: starting ${ is missing ending }","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"}
      {"log.level":"info","@timestamp":"2024-01-24T18:29:06.361Z","log.origin":{"file.name":"upgrade/upgrade.go","file.line":111},"message":"Source URI changed from \"https://artifacts.elastic.co/downloads/\" to \"https://artifacts.elastic.co/downloads/\"","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"}
      {"log.level":"error","@timestamp":"2024-01-24T18:29:06.363Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":894},"message":"applying new policy: generating component model: rendering inputs failed: starting ${ is missing ending }","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"}
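
To confirm that the flood consists of one repeating error rather than several distinct ones, the log entries can be grouped by level and message. The sketch below is a minimal illustration; `summarizeLogs` is a hypothetical helper, and the sample reuses the two messages shown above with the other fields omitted.

```javascript
// Count log entries per "level: message" pair from newline-delimited JSON.
function summarizeLogs(ndjson) {
  const counts = new Map();
  for (const line of ndjson.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '') continue;
    let entry;
    try {
      entry = JSON.parse(trimmed);
    } catch (err) {
      continue; // skip lines that are not valid JSON
    }
    const key = entry['log.level'] + ': ' + entry.message;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Sample entries built from the messages in the log excerpt above.
const renderError =
  'applying new policy: generating component model: rendering inputs failed: starting ${ is missing ending }';
const sourceUriInfo =
  'Source URI changed from "https://artifacts.elastic.co/downloads/" to "https://artifacts.elastic.co/downloads/"';
const sample = [
  { 'log.level': 'error', message: renderError },
  { 'log.level': 'info', message: sourceUriInfo },
  { 'log.level': 'error', message: renderError },
]
  .map((entry) => JSON.stringify(entry))
  .join('\n');

for (const [key, count] of summarizeLogs(sample)) {
  console.log(count + 'x ' + key);
}
```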
      

Because all the units report healthy and the error is generated at the coordinator level, it becomes impossible to apply other changes.
I tried:

  • changing the log level of the Elastic Agent - works 🟢
  • removing monitors - works 🟢
  • adding another random integration to the private location (e.g. 1password) - fails 🔴
    • the 1password integration appears only in pre-config.yaml, not in state.yaml etc. - fails 🔴

Still, I consider the failed variable substitution itself a "light" problem, since a user isn't supposed to escape the JS code entered in the Synthetics JS code editor.


TL;DR:

I think this issue should lead to 2 outcomes:

  • [ENHANCEMENT] Elastic Agent should not replace variables across the whole policy at once. It should instead process the policy component by component (or unit by unit), so that an invalid input in one integration cannot block other actions or integrations -> isolate the integrations' configurations / scoped processing of the policy
  • [BUGFIX?] Understand why the Synthetic monitor which uses ${...} properly is completely ignored, and is neither reported as unhealthy nor logged
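
The scoped processing proposed in the first outcome can be sketched as follows. This is purely illustrative and is not Elastic Agent's implementation: `renderInput` and `renderPolicyScoped` are hypothetical names, and the check inside `renderInput` is a simplified stand-in for the real variable rendering.

```javascript
// Hypothetical stand-in for variable rendering: rejects text that has an
// opening `${` with no later closing `}`. Not Elastic Agent's real logic.
function renderInput(input) {
  const open = input.config.indexOf('${');
  if (open !== -1 && input.config.indexOf('}', open) === -1) {
    throw new Error('starting ${ is missing ending }');
  }
  return { id: input.id, config: input.config };
}

// Render each input on its own, so one failure does not block the others.
// Failures are collected so they can be reported per input.
function renderPolicyScoped(inputs, render) {
  const rendered = [];
  const failed = [];
  for (const input of inputs) {
    try {
      rendered.push(render(input));
    } catch (err) {
      failed.push({ id: input.id, error: err.message });
    }
  }
  return { rendered, failed };
}

const result = renderPolicyScoped(
  [
    { id: 'http-monitor', config: 'urls: https://example.com' },
    { id: 'browser-monitor', config: 'script: console.log(`${oops`)' },
    { id: '1password', config: 'token: abc' },
  ],
  renderInput
);

console.log(result.rendered.map((r) => r.id));
console.log(result.failed);
```

With this shape, the broken browser monitor ends up in `failed`, where it could be surfaced as an unhealthy unit, while the other two inputs are still applied.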

FYI

  • @pierrehilbert (as discussed in our sync, for a check by your team)
  • @andrewvc (as this was discussed in an internal Slack thread of Synthetics)

CC @psanz-estc as he reported this initially

@lucabelluccini lucabelluccini added the bug Something isn't working label Jan 24, 2024
Member

cmacknz commented Jan 24, 2024

On first read I think this is the combination of these two problems:
