Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR] Disable replay in session mode when rate limit is active #3854

Merged
merged 13 commits into from
Nov 15, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@

- Using MaxBreadcrumb with value 0 no longer crashes. ([#3836](https://github.com/getsentry/sentry-java/pull/3836))
- Accept manifest integer values when requiring floating values ([#3823](https://github.com/getsentry/sentry-java/pull/3823))
- Session Replay: Disable replay in session mode when rate limit is active ([#3854](https://github.com/getsentry/sentry-java/pull/3854))

## 7.16.0

Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
package io.sentry.android.replay.capture

import android.graphics.Bitmap
import io.sentry.DataCategory.All
import io.sentry.DataCategory.Replay
import io.sentry.IConnectionStatusProvider.ConnectionStatus.DISCONNECTED
import io.sentry.IHub
import io.sentry.SentryLevel.DEBUG
Expand Down Expand Up @@ -78,6 +80,12 @@ internal class SessionCaptureStrategy(
bitmap?.recycle()
return
}
val rateLimiter = hub?.rateLimiter
if (rateLimiter?.isActiveForCategory(All) == true || rateLimiter?.isActiveForCategory(Replay) == true) {
options.logger.log(DEBUG, "Skipping screenshot recording, rate limit is active")
bitmap?.recycle()
return
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brustolin and @romtsn I think this is a different behaviour than on iOS getsentry/sentry-cocoa#4496. On iOS, when a limit for SR gets active, we drop everything until the next session starts. Please align.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in our last replay meeting, we agreed to start with the basics and stop session replay altogether for the remainder of the app session.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be more work for me to stop the entire replay than just not recording screenshots tbh. Besides, what if rate-limit is active for let's say 30seconds when we''re just starting the replay and because of that we're going to drop the entire replay? i'd prefer to do our best and start capturing from when the rate-limit is lifted?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd prefer to do our best and start capturing from when the rate-limit is lifted?

That's also my opinion. This would also simplify the logic.

Copy link

@brustolin brustolin Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But your current logic is not taking into account that the envelope with segment 0 could be the one that was lost. It does not matter anymore to send to following segments, they all will be ignored in the server.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, this approach is just throwing away the already-taken screenshot. Maybe Android is different; for iOS, making the screenshot is the most expensive part of the process.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another point to take into account, someone can correct me on this one, is that the most common form of rate limiting is quota exceeded. And this will only be lifted at the end of the month.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But your current logic is not taking into account that the envelope with segment 0 could be the one that was lost. It does not matter anymore to send to following segments, they all will be ignored in the server.

Why would it be lost? We're not capturing envelopes if rate limit is active right? Or do you mean if we get rate-limited upon sending the segment 0 event? In this case on Android it'd be stored in cache until rate-limit is lifted. The only case it gets lost if the cache gets rotated, but it's an edge-case enough to not handle it for now I guess.

Btw, this approach is just throwing away the already-taken screenshot. Maybe Android is different; for iOS, making the screenshot is the most expensive part of the process.

That's a good point, I should probably move this logic to where we capture the screenshot 👍

Another point to take into account, someone can correct me on this one, is that the most common form of rate limiting is quota exceeded. And this will only be lifted at the end of the month.

Probably yes, but not sure about the end of the month (could be on-demand events etc etc). I think this is also mostly relevant for big customers who have spike protection and whatnot in place, so we shouldn't think in "months" here probably :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this will only be lifted at the end of the month.

Nope, spike protection works on an hour basis: https://docs.sentry.io/pricing/quotas/spike-protection/#how-sentry-detects-spikes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have enough insight into SR. I let you two decide this.

// have to do it before submitting, otherwise if the queue is busy, the timestamp won't be
// reflecting the exact time of when it was captured
val frameTimestamp = dateProvider.currentTimeMillis
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ package io.sentry.android.replay.capture

import android.graphics.Bitmap
import io.sentry.Breadcrumb
import io.sentry.DataCategory
import io.sentry.DateUtils
import io.sentry.IHub
import io.sentry.Scope
Expand All @@ -27,6 +28,7 @@ import io.sentry.rrweb.RRWebBreadcrumbEvent
import io.sentry.rrweb.RRWebMetaEvent
import io.sentry.transport.CurrentDateProvider
import io.sentry.transport.ICurrentDateProvider
import io.sentry.transport.RateLimiter
import org.junit.Rule
import org.junit.rules.TemporaryFolder
import org.mockito.ArgumentMatchers.anyInt
Expand All @@ -36,6 +38,8 @@ import org.mockito.kotlin.anyOrNull
import org.mockito.kotlin.argThat
import org.mockito.kotlin.check
import org.mockito.kotlin.doAnswer
import org.mockito.kotlin.doReturn
import org.mockito.kotlin.eq
import org.mockito.kotlin.mock
import org.mockito.kotlin.never
import org.mockito.kotlin.verify
Expand Down Expand Up @@ -69,6 +73,7 @@ class SessionCaptureStrategyTest {
doAnswer {
(it.arguments[0] as ScopeCallback).run(scope)
}.whenever(it).configureScope(any())
doReturn(mock<RateLimiter>()).whenever(it).rateLimiter
}
var persistedSegment = LinkedHashMap<String, String?>()
val replayCache = mock<ReplayCache> {
Expand Down Expand Up @@ -367,4 +372,18 @@ class SessionCaptureStrategyTest {
"the current replay cache folder is not being deleted."
)
}

@Test
fun `when rate limited does not capture screenshots`() {
val strategy = fixture.getSut()

whenever(fixture.hub.rateLimiter?.isActiveForCategory(eq(DataCategory.Replay))).thenReturn(true)

strategy.start(fixture.recorderConfig)
var called = false
strategy.onScreenshotRecorded(mock<Bitmap>()) {
called = true
}
assertFalse(called)
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -174,6 +174,9 @@ private DataCategory categoryFromItemType(SentryItemType itemType) {
if (SentryItemType.CheckIn.equals(itemType)) {
return DataCategory.Monitor;
}
if (SentryItemType.ReplayVideo.equals(itemType)) {
return DataCategory.Replay;
}

return DataCategory.Default;
}
Expand Down
2 changes: 2 additions & 0 deletions sentry/src/main/java/io/sentry/transport/RateLimiter.java
Original file line number Diff line number Diff line change
Expand Up @@ -177,6 +177,8 @@ private boolean isRetryAfter(final @NotNull String itemType) {
return DataCategory.Transaction;
case "check_in":
return DataCategory.Monitor;
case "replay_video":
return DataCategory.Replay;
default:
return DataCategory.Unknown;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,12 +10,14 @@ import io.sentry.Hint
import io.sentry.IHub
import io.sentry.NoOpLogger
import io.sentry.ProfilingTraceData
import io.sentry.ReplayRecording
import io.sentry.Sentry
import io.sentry.SentryEnvelope
import io.sentry.SentryEnvelopeHeader
import io.sentry.SentryEnvelopeItem
import io.sentry.SentryEvent
import io.sentry.SentryOptions
import io.sentry.SentryReplayEvent
import io.sentry.SentryTracer
import io.sentry.Session
import io.sentry.TracesSamplingDecision
Expand Down Expand Up @@ -71,13 +73,14 @@ class ClientReportTest {
SentryEnvelopeItem.fromAttachment(opts.serializer, NoOpLogger.getInstance(), Attachment("{ \"number\": 10 }".toByteArray(), "log.json"), 1000),
SentryEnvelopeItem.fromProfilingTrace(ProfilingTraceData(File(""), transaction), 1000, opts.serializer),
SentryEnvelopeItem.fromCheckIn(opts.serializer, CheckIn("monitor-slug-1", CheckInStatus.ERROR)),
SentryEnvelopeItem.fromMetrics(EncodedMetrics(emptyMap()))
SentryEnvelopeItem.fromMetrics(EncodedMetrics(emptyMap())),
SentryEnvelopeItem.fromReplay(opts.serializer, opts.logger, SentryReplayEvent(), ReplayRecording(), false)
)

clientReportRecorder.recordLostEnvelope(DiscardReason.NETWORK_ERROR, envelope)

val clientReportAtEnd = clientReportRecorder.resetCountsAndGenerateClientReport()
testHelper.assertTotalCount(15, clientReportAtEnd)
testHelper.assertTotalCount(16, clientReportAtEnd)
testHelper.assertCountFor(DiscardReason.SAMPLE_RATE, DataCategory.Error, 3, clientReportAtEnd)
testHelper.assertCountFor(DiscardReason.BEFORE_SEND, DataCategory.Error, 2, clientReportAtEnd)
testHelper.assertCountFor(DiscardReason.QUEUE_OVERFLOW, DataCategory.Transaction, 1, clientReportAtEnd)
Expand All @@ -90,6 +93,7 @@ class ClientReportTest {
testHelper.assertCountFor(DiscardReason.NETWORK_ERROR, DataCategory.Profile, 1, clientReportAtEnd)
testHelper.assertCountFor(DiscardReason.NETWORK_ERROR, DataCategory.Monitor, 1, clientReportAtEnd)
testHelper.assertCountFor(DiscardReason.NETWORK_ERROR, DataCategory.MetricBucket, 1, clientReportAtEnd)
testHelper.assertCountFor(DiscardReason.NETWORK_ERROR, DataCategory.Replay, 1, clientReportAtEnd)
}

@Test
Expand Down
23 changes: 23 additions & 0 deletions sentry/src/test/java/io/sentry/transport/RateLimiterTest.kt
Original file line number Diff line number Diff line change
Expand Up @@ -5,15 +5,18 @@ import io.sentry.CheckIn
import io.sentry.CheckInStatus
import io.sentry.Hint
import io.sentry.IHub
import io.sentry.ILogger
import io.sentry.ISerializer
import io.sentry.NoOpLogger
import io.sentry.ProfilingTraceData
import io.sentry.ReplayRecording
import io.sentry.SentryEnvelope
import io.sentry.SentryEnvelopeHeader
import io.sentry.SentryEnvelopeItem
import io.sentry.SentryEvent
import io.sentry.SentryOptions
import io.sentry.SentryOptionsManipulator
import io.sentry.SentryReplayEvent
import io.sentry.SentryTracer
import io.sentry.Session
import io.sentry.TransactionContext
Expand Down Expand Up @@ -354,4 +357,24 @@ class RateLimiterTest {

assertTrue(rateLimiter.isAnyRateLimitActive)
}

@Test
fun `drop replay items as lost`() {
val rateLimiter = fixture.getSUT()
val hub = mock<IHub>()
whenever(hub.options).thenReturn(SentryOptions())

val replayItem = SentryEnvelopeItem.fromReplay(fixture.serializer, mock<ILogger>(), SentryReplayEvent(), ReplayRecording(), false)
val attachmentItem = SentryEnvelopeItem.fromAttachment(fixture.serializer, NoOpLogger.getInstance(), Attachment("{ \"number\": 10 }".toByteArray(), "log.json"), 1000)
val envelope = SentryEnvelope(SentryEnvelopeHeader(), arrayListOf(replayItem, attachmentItem))

rateLimiter.updateRetryAfterLimits("60:replay:key", null, 1)
val result = rateLimiter.filter(envelope, Hint())

assertNotNull(result)
assertEquals(1, result.items.toList().size)

verify(fixture.clientReportRecorder, times(1)).recordLostEnvelopeItem(eq(DiscardReason.RATELIMIT_BACKOFF), same(replayItem))
verifyNoMoreInteractions(fixture.clientReportRecorder)
}
}
Loading