
[Bug] Unable to run workflows with OpenTelemetry and ddtrace #733

Open · connected-bkiiskila opened this issue Jan 16, 2025 · 8 comments
Labels: bug (Something isn't working)

@connected-bkiiskila

What are you really trying to do?

I'm attempting to enable tracing on my worker using the TracingInterceptor and the tracer provided by ddtrace.

Describe the bug

An upstream change in the OpenTelemetry SDK seems to have caused an issue with Temporal's sandbox environment, resulting in an error stating that os.environ.get isn't allowed in a workflow.

The relevant change on the OpenTelemetry side is here; if I run with a version prior to 1.29, I no longer get the error.

Minimal Reproduction

Run a workflow on a client with the TracingInterceptor attached and a valid OpenTelemetry SDK configured; in my case I'm using ddtrace with the DD_TRACE_OTEL_ENABLED environment variable set to true. From what I can tell, though, the call that isn't allowed in the sandbox comes directly from the OpenTelemetry SDK and not the ddtrace patch.
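
For concreteness, a rough sketch of the setup being described (untested; the placeholder workflow, task queue name, and server address are illustrative assumptions rather than the reporter's actual code):

import asyncio

from temporalio import workflow
from temporalio.client import Client
from temporalio.contrib.opentelemetry import TracingInterceptor
from temporalio.worker import Worker


@workflow.defn
class HelloWorkflow:
    # Placeholder workflow for illustration; per the report, running a
    # workflow through the traced worker triggers the os.environ.get error.
    @workflow.run
    async def run(self, name: str) -> str:
        return f"Hello, {name}!"


async def main() -> None:
    # ddtrace is assumed to have installed its OpenTelemetry tracer provider
    # already (e.g. worker started under ddtrace-run with
    # DD_TRACE_OTEL_ENABLED=true in the environment).
    client = await Client.connect(
        "localhost:7233",  # hypothetical server address
        interceptors=[TracingInterceptor()],
    )
    worker = Worker(client, task_queue="otel-repro-queue", workflows=[HelloWorkflow])
    await worker.run()


if __name__ == "__main__":
    asyncio.run(main())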

Environment/Versions

  • OS and processor: macOS Apple Silicon
  • Temporal Version: SDK 1.9.0
  • Running in Docker arm64

Additional context

You can see the stacktrace and error that's displayed on the Temporal UI here: https://gist.github.com/connected-bkiiskila/d11592cb86271b5f343a4d387097d418

cretz (Member) commented Jan 16, 2025

Thanks! Hrmmm...

Looking at your stack trace, it appears that at https://github.com/DataDog/dd-trace-py/blob/eddd51464e8c32db019e239c106317228b65667c/ddtrace/internal/opentelemetry/context.py#L24-L26 opentelemetry.baggage and other modules are unfortunately imported at runtime, which means you may be inadvertently re-importing things for every workflow run. You should be able to add import opentelemetry.baggage somewhere outside of your workflow (e.g. where you start the worker) to run the module initialization code that ddtrace is running lazily. This will also improve your performance. ddtrace (reasonably) assumes modules are cached and imports are not rerun, which holds in other environments, but that's not the case with workflows.
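
For example, something along these lines at the top of the worker's entrypoint module (a rough sketch; assuming the baggage/propagation modules are the ones appearing in the stack trace):

# In the worker process, outside any workflow definition: import the OTel
# modules eagerly so ddtrace's lazy "from opentelemetry.baggage import get_all"
# finds them already initialized in sys.modules instead of re-running their
# module initialization inside the workflow sandbox on every run.
import opentelemetry.baggage
import opentelemetry.baggage.propagation
import opentelemetry.trace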

Though I am a bit surprised our import opentelemetry.baggage.propagation doesn't imply this. You may also have to configure the sandbox restrictions to treat these OTel imports as pass through (https://github.com/temporalio/sdk-python?tab=readme-ov-file#passthrough-modules). It will require testing...

connected-bkiiskila (Author) commented:

I'll give that extra import a shot, see if that works, and report back.

connected-bkiiskila (Author) commented:

Hmmm, that doesn't seem to work either; the only thing I'm having luck with is downgrading to <1.29.

For that passthrough documentation, would I just be flat-out allowing os.environ.get, or is there a way to allow specifically the call within the OpenTelemetry SDK?

cretz (Member) commented Jan 16, 2025

For that passthrough documentation, would I just be flat-out allowing os.environ.get, or is there a way to allow specifically the call within the OpenTelemetry SDK?

The passthrough would be for passing through the imports, unrelated to the restrictions happening on os.environ.get. What's happening now, from your stack trace, is that from opentelemetry.baggage import get_all is being run at runtime in the workflow. We want that to have already been cached. You can probably just add this to the top of your workflow file:

from temporalio import workflow  # provides the unsafe import passthrough helper

with workflow.unsafe.imports_passed_through():
    import opentelemetry.baggage

But this is untested. Even if we do allow os.environ.get through the sandbox, the bigger problem is that you are reimporting OTel libraries for every workflow run, which you don't want to do.

connected-bkiiskila (Author) commented:

Seems like that <1.29 fix was a fluke and only works some of the time for whatever reason, so I'm likely going to have to allow the import of os.environ.
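
If it does come to that, the restriction-customization pattern from the sdk-python README looks applicable; a rough, untested sketch (the exact member path "os" / "environ" / "get" is an assumption inferred from the error message, not something confirmed in this thread):

import dataclasses

from temporalio.worker import Worker
from temporalio.worker.workflow_sandbox import SandboxedWorkflowRunner, SandboxRestrictions

# Assumption: remove the restriction on os.environ.get specifically, mirroring
# the README's datetime.date.today example, rather than passing all of os through.
my_restrictions = dataclasses.replace(
    SandboxRestrictions.default,
    invalid_module_members=SandboxRestrictions.invalid_module_members_default.with_child_unrestricted(
        "os", "environ", "get"
    ),
)

my_worker = Worker(
    ...,  # client, task queue, workflows, etc.
    workflow_runner=SandboxedWorkflowRunner(restrictions=my_restrictions),
)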

cretz (Member) commented Jan 16, 2025

We may also be able to look into it, though it seems to be ddtrace-specific in that they import at runtime. I do think it is worth doing your best to prevent this import from being treated as a new import for every workflow run (since that means it runs all OTel bootstrap code every run and takes up extra memory).

connected-bkiiskila (Author) commented:

Yeah, I'd like to avoid it if possible. Would you suggest creating a ticket on the ddtrace side and referencing this?

connected-bkiiskila changed the title from "[Bug] Unable to run workflows with OpenTelemetry SDK 1.29" to "[Bug] Unable to run workflows with OpenTelemetry and ddtrace" on Jan 16, 2025
cretz (Member) commented Jan 16, 2025

ddtrace is not really doing anything wrong as far as a normal Python library goes; the runtime imports at https://github.com/DataDog/dd-trace-py/blob/eddd51464e8c32db019e239c106317228b65667c/ddtrace/internal/opentelemetry/context.py#L24-L26 are only a problem in Temporal workflows. But we need to get those imports preloaded and passed through. You may be able to do something like:

# Preload the OTel modules once at worker startup so workflow runs reuse the
# cached modules instead of re-importing them inside the sandbox.
import opentelemetry.baggage
import opentelemetry.trace

from temporalio.worker import Worker
from temporalio.worker.workflow_sandbox import SandboxedWorkflowRunner, SandboxRestrictions

my_worker = Worker(
    ...,
    workflow_runner=SandboxedWorkflowRunner(
        restrictions=SandboxRestrictions.default.with_passthrough_modules("opentelemetry", "ddtrace")
    ),
)

when you create your worker.
