feat(sentry apps): Add context manager for sentry apps and impl for event webhooks #86136
base: master
Conversation
logger.info("process_resource_change.event_missing_event", extra=extra)
with SentryAppInteractionEvent(
    operation_type=SentryAppInteractionType.PREPARE_WEBHOOK,
    event_type=f"process_resource_change.{sender}_{action}",
We don't have an 'event' string at this point, so we're making one up.
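For context, here is a minimal sketch of the context-manager pattern this diff relies on. The record helper and outcome strings are assumptions for illustration only, not Sentry's actual implementation:

    from contextlib import contextmanager

    def record(operation_type: str, event_type: str, outcome: str) -> None:
        # Stand-in metrics emitter, assumed purely for illustration.
        print(f"sentry_app.{operation_type}.{outcome}", {"event_type": event_type})

    @contextmanager
    def sentry_app_interaction_event(operation_type: str, event_type: str):
        # Emit "started", then "success" or "failure" depending on how the body exits.
        record(operation_type, event_type, "started")
        try:
            yield
        except Exception:
            record(operation_type, event_type, "failure")
            raise
        else:
            record(operation_type, event_type, "success")

    # Usage mirroring the diff above:
    with sentry_app_interaction_event("prepare_webhook", "process_resource_change.error_created"):
        pass  # prepare the webhook payload here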
    lifecycle.record_failure(e)
    return None
except (ApiHostError, ApiTimeoutError, RequestException, ClientError) as e:
    lifecycle.record_halt(e)
ApiHostError & ApiTimeoutError are raised from send_and_save_webhook_request when the response is a 503 or 504. RequestExceptions occur when we get a Timeout or ConnectionError; I also wouldn't consider those our server's fault. ClientErrors are anything <500 and also wouldn't be our fault, because it's the third party's responsibility to properly consume our requests. We re-raise since the TASK_OPTIONS already have the retry logic specified (i.e. which errors to retry or ignore).
JFYI that we will not encounter this block for non-published apps (e.g. internal & unpublished), as we only propagate errors up from send_and_save_webhook_request for published apps.
Codecov Report
Attention: Patch coverage is …
✅ All tests successful. No failed tests found.
Additional details and impacted files

@@             Coverage Diff             @@
##           master   #86136       +/-   ##
===========================================
+ Coverage   45.90%   87.92%   +42.01%
===========================================
  Files        9709     9721       +12
  Lines      551059   551742      +683
  Branches    21527    21401      -126
===========================================
+ Hits       252943   485095   +232152
+ Misses     297730    66263   -231467
+ Partials      386      384        -2
MockResponseInstance = MockResponse({}, b"{}", "", True, 200, raiseStatusFalse, None)
MockResponse404 = MockResponse({}, b'{"bruh": "bruhhhhhhh"}', "", False, 404, raiseException, None)
need to be bytes for json.loads()
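For reference, json.loads accepts a bytes body directly (it has since Python 3.6), which is what real response content is — a quick check:

    import json

    body = b'{"bruh": "bruhhhhhhh"}'  # mirrors the mock body above
    assert json.loads(body) == {"bruh": "bruhhhhhhh"}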
lifecycle.record_failure(
    failure_reason="process_resource_change.event_missing_event", extra=extra
)
return
should these be raised as exceptions and caught inside process_resource_change_bound? then they would automatically be recorded as failures, and you can pick which exceptions to retry the task with
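A sketch of what that suggestion could look like; EventMissingError and the handler shape are hypothetical:

    class EventMissingError(Exception):
        """Hypothetical exception for the 'event missing' early return."""

    def process_resource_change_bound(lifecycle, do_prepare):
        try:
            do_prepare()  # raises instead of recording outcomes inline
        except EventMissingError as e:
            # Recorded automatically in one place; not re-raised, since a
            # permanently missing event isn't worth retrying the task for.
            lifecycle.record_failure(e)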
🤷♀️ ehhhh, they could? I think the benefit with the current code is that it's clear if and why we're recording X outcome or retrying the task. Having process_resource_change_bound do the exception handling makes my life easier since I can just raise an error, but I think this way's clearer, so ye
make your life easier c:
tl;dr i think it's better to manually record halt than failure
@@ -110,3 +112,12 @@ def assert_middleware_metrics(middleware_calls):
    assert end1.args[0] == EventLifecycleOutcome.SUCCESS
    assert start2.args[0] == EventLifecycleOutcome.STARTED
    assert end2.args[0] == EventLifecycleOutcome.SUCCESS


def assert_count_of_metric(mock_record, outcome, outcome_count):
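The body of assert_count_of_metric isn't shown in this hunk; a plausible implementation, consistent with how it's used in this discussion, would be:

    def assert_count_of_metric(mock_record, outcome, outcome_count):
        # Count how many lifecycle records terminated with the given outcome.
        calls = [call for call in mock_record.mock_calls if call.args[0] == outcome]
        assert len(calls) == outcome_count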
for asserting failure, it might help to be more specific by checking the message that the lifecycle is terminating with, like here:

sentry/src/sentry/testutils/asserts.py, lines 73 to 80 in 6eeb123:
def assert_failure_metric(mock_record, error_msg):
    (event_failures,) = (
        call for call in mock_record.mock_calls if call.args[0] == EventLifecycleOutcome.FAILURE
    )
    if isinstance(error_msg, Exception):
        assert isinstance(event_failures.args[1], type(error_msg))
    else:
        assert event_failures.args[1] == error_msg
currently when you assert a single failure metric, you're only asserting on the count, not on the exact failure site; you could pin that down by also checking which exception or message it failed with
we can do both; currently in the tests I do assert_failure_metric alongside assert_count_of_metric for most of the tests, but yeah, I can make that consistent and do both for all.
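A hypothetical test fragment combining the two assertions, with mock_record standing in for the patched lifecycle recorder and EventLifecycleOutcome coming from Sentry's metrics module:

    # Exactly one failure, and it failed for the expected reason
    # (the failure_reason string from the diff above).
    assert_count_of_metric(mock_record, EventLifecycleOutcome.FAILURE, 1)
    assert_failure_metric(mock_record, "process_resource_change.event_missing_event")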
Per https://www.notion.so/sentry/Sentry-App-SLOs-19f8b10e4b5d805dacb9f80beccd1c65?pvs=4#1a68b10e4b5d807a80fbde13daf0ff7b
We're separating out the preparation and sending of webhooks.
Preparation
- process_resource_change_bound

Sending
- send_webhooks
- send_resource_change_webhook
- send_and_save_webhook_request
With the current implementation we will receive the following metrics (illustrated in the sketch after this list):

- sentry_app.prepare_webhook.{outcome} w/ event_type: "process_resource_change.{sender}_{action}"
  - sender would be Error, Group
  - action would be "created"
- sentry_app.send_webhook.{outcome} w/ event_type: issue.created or error.created
  - potentially recorded 2 times (is this bad?), once in send_resource_change_webhook and once in send_and_save_webhook_request
  - the code between send_resource_change_webhook & send_webhooks is p. minimal (also we have the logs to pinpoint the failure place)
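To make the naming concrete, a hypothetical trace of the series emitted for a single error event (tag names and exact values are assumptions):

    # Hypothetical series emitted for one error event, per the naming above.
    emitted = [
        # Preparation, from process_resource_change_bound:
        ("sentry_app.prepare_webhook.success", {"event_type": "process_resource_change.error_created"}),
        # Sending, recorded once in send_resource_change_webhook and again
        # in send_and_save_webhook_request:
        ("sentry_app.send_webhook.success", {"event_type": "error.created"}),
        ("sentry_app.send_webhook.success", {"event_type": "error.created"}),
    ]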