Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(tracing): Backfill missing sample_rand on PropagationContext #4038

Merged
merged 3 commits into from
Feb 26, 2025

Conversation

szokeasaurusrex
Copy link
Member

@szokeasaurusrex szokeasaurusrex commented Feb 11, 2025

Whenever the PropagationContext continues an incoming trace (i.e. whenever the trace_id is set, rather than being randomly generated as for a new trace), check if the sample_rand is present and valid in the incoming DSC. If the sample_rand is missing, generate it deterministically based on the trace_id and backfill it into the DSC on the PropagationContext.

When generating the backfilled sample_rand, we ensure the generated value is consistent with the incoming trace's sampling decision and sample rate, if both of these are present. Otherwise, we generate a new value in the range [0, 1).

Future PRs will address propagating the sample_rand to transactions generated with continue_trace (allowing the sample_rand to be propagated on outgoing traces), and will also allow sample_rand to be used for making sampling decisions.

Ref #3998

Copy link

codecov bot commented Feb 11, 2025

Codecov Report

Attention: Patch coverage is 88.70968% with 7 lines in your changes missing coverage. Please review.

Project coverage is 79.59%. Comparing base (5d26201) to head (cc7bc02).
Report is 1 commits behind head on master.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
sentry_sdk/tracing_utils.py 89.36% 4 Missing and 1 partial ⚠️
sentry_sdk/utils.py 71.42% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4038      +/-   ##
==========================================
+ Coverage   79.55%   79.59%   +0.03%     
==========================================
  Files         140      140              
  Lines       15523    15581      +58     
  Branches     2631     2643      +12     
==========================================
+ Hits        12349    12401      +52     
- Misses       2338     2343       +5     
- Partials      836      837       +1     
Files with missing lines Coverage Δ
sentry_sdk/scope.py 85.57% <100.00%> (+0.03%) ⬆️
sentry_sdk/tracing.py 77.73% <100.00%> (+0.13%) ⬆️
sentry_sdk/utils.py 83.99% <71.42%> (-0.11%) ⬇️
sentry_sdk/tracing_utils.py 85.50% <89.36%> (+0.45%) ⬆️

... and 1 file with indirect coverage changes

@szokeasaurusrex szokeasaurusrex force-pushed the szokeasaurusrex/sample_rand-2 branch 2 times, most recently from 14a70da to 0a0de06 Compare February 11, 2025 19:21
@szokeasaurusrex szokeasaurusrex changed the title feat: Ensure PropagationContext has sample_rand feat(tracing): Backfill missing sample_rand on PropagationContext Feb 11, 2025
@szokeasaurusrex szokeasaurusrex force-pushed the szokeasaurusrex/sample_rand-2 branch 2 times, most recently from 21bb038 to 836d5b1 Compare February 11, 2025 19:39
@szokeasaurusrex szokeasaurusrex marked this pull request as ready for review February 11, 2025 19:50
@szokeasaurusrex szokeasaurusrex force-pushed the szokeasaurusrex/sample_rand-2 branch from 836d5b1 to cbfc36a Compare February 12, 2025 11:43
@szokeasaurusrex szokeasaurusrex force-pushed the szokeasaurusrex/sample_rand-2 branch from cbfc36a to 6f66ce8 Compare February 12, 2025 11:56
Copy link
Member

@sl0thentr0py sl0thentr0py left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some initial comments,

  • the new computation _sample_rand_transformation and corresponding tests look fine
  • for the lifecycle of which cases _fill_sample_rand might behave weirdly, I need to think a bit more

Apart from that, this PR is touching mainly the TwP use case so please do some manual testing of TwP cases in a distributed setup (JS frontend + Python backend) and post trace links / screenshots.

Copy link
Member Author

@szokeasaurusrex szokeasaurusrex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my responses. Going to work on implementing requested changes now

@szokeasaurusrex szokeasaurusrex force-pushed the szokeasaurusrex/sample_rand-2 branch 2 times, most recently from fec92c5 to d486a8f Compare February 13, 2025 15:38
Copy link
Contributor

@sentrivana sentrivana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left two more suggestions. Please review. (And also please make sure Neel is also ok with the PR before merging)

@szokeasaurusrex szokeasaurusrex force-pushed the szokeasaurusrex/sample_rand-2 branch 2 times, most recently from 382ef64 to beb943b Compare February 18, 2025 15:22
@sl0thentr0py
Copy link
Member

sl0thentr0py commented Feb 18, 2025

this looks fine now, we will first review the other PRs, merge and test before merging this into master, so I'm making this a draft for now.

@sl0thentr0py sl0thentr0py marked this pull request as draft February 18, 2025 16:43
@szokeasaurusrex szokeasaurusrex force-pushed the szokeasaurusrex/sample_rand-2 branch 3 times, most recently from 5ffab05 to cb8e728 Compare February 24, 2025 16:18
@szokeasaurusrex szokeasaurusrex marked this pull request as ready for review February 24, 2025 16:21
@szokeasaurusrex
Copy link
Member Author

@sentrivana I have implemented the fixed-precision in this PR. Please re-review. Still need to do manual testing (will do this against the final PR in this stack)

@szokeasaurusrex szokeasaurusrex force-pushed the szokeasaurusrex/sample_rand-2 branch from cb8e728 to 9863506 Compare February 25, 2025 10:49
Copy link
Contributor

@sentrivana sentrivana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. I'm also not sure if the Decimal stuff is an improvement; with floats the code read much easier as we didn't have to do all the conversions (float -> decimal, decimal -> int, int -> decimal, decimal -> float again)...

Comment on lines 710 to 711
DECIMAL_0 = Decimal(0)
DECIMAL_1 = Decimal(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of these aliases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To appease the linter – these values are used in the default parameter value for the generate method. The linter complains because the values are generated once per program run, which is fine, since Decimal values are immutable. To indicate that this is intentional, we need to set them as constants (or I suppose, we could disable the lint on that line).

Comment on lines 768 to 772
if lower_int == upper_int:
# Edge case: lower_decimal < upper_decimal, but due to rounding,
# lower_int == upper_int. In this case, we return
# lower_int.scaleb(-SCALE_EXPONENT) here, since calling randrange()
# with the same lower and upper bounds will raise an error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How exactly can this happen? You write due to rounding, but we're only working with decimals/integers here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the case where lower_decimal = Decimal("0.500_000_0") and upper_decimal = Decimal("0.500_000_9").

In this case, lower_int and upper_int will both be equal to 500_000, since the seventh digit after the decimal is truncated by int.

@szokeasaurusrex
Copy link
Member Author

szokeasaurusrex commented Feb 25, 2025

I'm also not sure if the Decimal stuff is an improvement; with floats the code read much easier as we didn't have to do all the conversions (float -> decimal, decimal -> int, int -> decimal, decimal -> float again)...

I would at least keep the random generation as integer ranges, that way we have precise control to ensure the upper bound is excluded from the range. Not sure about the float conversion you mention here – I don't think we are converting the Decimal values back to float but maybe I am wrong. Agreed that just using float is easier to read; I am just hesitant since, given we want the exact 6-digit precision, float values can produce weird results

@szokeasaurusrex szokeasaurusrex force-pushed the szokeasaurusrex/sample_rand-2 branch 3 times, most recently from c60cb71 to 2b3f4f7 Compare February 25, 2025 14:31
Copy link
Contributor

@sentrivana sentrivana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! 👍🏻

Whenever the `PropagationContext` continues an incoming trace (i.e. whenever the `trace_id` is set, rather than being randomly generated as for a new trace), check if the `sample_rand` is present and valid in the incoming DSC. If the `sample_rand` is missing, generate it deterministically based on the `trace_id` and backfill it into the DSC on the `PropagationContext`.

When generating the backfilled `sample_rand`, we ensure the generated value is consistent with the incoming trace's sampling decision and sample rate, if both of these are present. Otherwise, we generate a new value in the range [0, 1).

Future PRs will address propagating the `sample_rand` to transactions generated with `continue_trace` (allowing the `sample_rand` to be propagated on outgoing traces), and will also allow `sample_rand` to be used for making sampling decisions.

Ref #3998
@szokeasaurusrex szokeasaurusrex force-pushed the szokeasaurusrex/sample_rand-2 branch from 2b3f4f7 to 789b065 Compare February 25, 2025 14:53
@szokeasaurusrex szokeasaurusrex dismissed sl0thentr0py’s stale review February 25, 2025 19:11

Dismissing this review, since all of the requested changes have been made, and Neel is on vacation, so he cannot approve the PR until he is back.

sentrivana pushed a commit that referenced this pull request Feb 26, 2025
Use the `sample_rand` value from an incoming trace to make sampling
decisions, rather than generating a random value. When we are the head
SDK starting a new trace, save our randomly-generated value as the
`sample_rand`, and also change the random generation logic so that the
`sample_rand` is computed deterministically based on the `trace_id`.

Depends on:
  - #4040 
  - #4038 

Closes #3998
szokeasaurusrex and others added 2 commits February 26, 2025 15:39
`continue_trace` now propagates incoming `sample_rand` values to the
transaction's baggage. Also, in the case where `sample_rand` is missing
from the incoming trace and needs to be backfilled, this change
introduces a mechanism for the backfilled value from the scope's
propagation context to be propagated to the transaction's baggage.

The transaction still does not use the `sample_rand` for making sampling
decisions; this PR only enables propagation. A future PR will add
support for reading the incoming/backfilled `sample_rand` and for using
this value to make sampling decisions.

Depends on:
  - #4038 

Ref #3998
@sentrivana sentrivana merged commit 0d23b72 into master Feb 26, 2025
142 checks passed
@sentrivana sentrivana deleted the szokeasaurusrex/sample_rand-2 branch February 26, 2025 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants