source-braintree-native: fix missing disputes documents
We've received reports of missing `disputes` documents. Both of the
missing `disputes` provided had a `received_date` one day before their
`created_at`, so we can't assume that `received_date` closely
approximates `created_at` when searching Braintree.
Unfortunately, since Braintree doesn't expose `created_at` for searching
disputes, we're stuck using `received_date`. To avoid missing these
types of results, `disputes` now searches a date window at least two
days wide for new results. Assuming `received_date` and `created_at` are
never more than one day apart, this should ensure the connector doesn't
miss data for this reason again.
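
To make the failure mode concrete, here is a minimal sketch of the window adjustment, not the connector's actual code. `search_start` and `date_in_search` are hypothetical helpers, and the sketch assumes the `received_date` search matches inclusively at date granularity.

```python
from datetime import date, datetime, timedelta, timezone


def search_start(log_cursor: datetime, end: datetime) -> datetime:
    """Move the search start back one day when the window sits within one date."""
    if log_cursor.date() == end.date():
        return log_cursor - timedelta(days=1)
    return log_cursor


def date_in_search(received: date, start: datetime, end: datetime) -> bool:
    # Assumption: the search compares received_date by date only, inclusive.
    return start.date() <= received <= end.date()


# A caught-up stream with a 1 hour window, all within 2025-01-11.
log_cursor = datetime(2025, 1, 11, 0, 30, tzinfo=timezone.utc)
end = log_cursor + timedelta(hours=1)

# The dispute was created at 2025-01-11T00:50:00Z but received on 2025-01-10.
received = date(2025, 1, 10)

missed_before = not date_in_search(received, log_cursor, end)
found_after = date_in_search(received, search_start(log_cursor, end), end)
```

With the unadjusted start, the search only covers 2025-01-11 and the dispute is skipped. With the adjusted start, the search covers 2025-01-10 through 2025-01-11 and the dispute is returned.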
Alex-Bair committed Jan 22, 2025
1 parent 3fc24e0 commit 047ceec
Showing 1 changed file with 12 additions and 4 deletions.
16 changes: 12 additions & 4 deletions source-braintree-native/source_braintree_native/api.py
@@ -382,6 +382,10 @@ async def fetch_subscriptions(
     yield end


+def _are_same_day(start: datetime, end: datetime) -> bool:
+    return start.date() == end.date()
+
+
 async def fetch_disputes(
     braintree_gateway: BraintreeGateway,
     window_size: int,
@@ -393,12 +397,16 @@ async def fetch_disputes(
     window_end = log_cursor + timedelta(hours=window_size)
     end = min(window_end, datetime.now(tz=UTC))

-    # Braintree does not let us search disputes based on the created_at field. I assume received_at is an adequate proxy
-    # for created_at, although received_at is less granular than created_at (date vs. datetime). We'll always receive
-    # results we've already seen in this search, but we filter those out client-side.
+    # Braintree does not let us search disputes based on the created_at field, and the received_date field is
+    # the best alternative that Braintree exposes for searching. Since received_date can be earlier than
+    # created_at, it's possible to miss records with a small enough window size when the stream is caught up to the present.
+    # Ex: {'id': 'dispute_1', 'received_date': '2025-01-10', 'created_at': '2025-01-11T00:50:00Z'} could be missed with
+    # a window size of 1 hour. To avoid missing these types of results, we move the start of the received_date search back one day.
+    start = log_cursor - timedelta(days=1) if _are_same_day(log_cursor, end) else log_cursor
+
     search_result = await asyncio.to_thread(
         braintree_gateway.dispute.search,
-        DisputeSearch.received_date.between(log_cursor, end),
+        DisputeSearch.received_date.between(start, end),
     )

     count = 0
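
Widening the search means it returns more disputes that were already emitted. The removed comment notes that previously seen results are filtered out client-side. The rest of `fetch_disputes` is not shown in this diff, so the following is only a sketch of one way that filtering could work. The `Dispute` dataclass and `new_disputes` helper are hypothetical, and comparing `created_at` against the cursor is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Dispute:
    # Hypothetical stand-in for a Braintree dispute object.
    id: str
    created_at: datetime


def new_disputes(results: list[Dispute], log_cursor: datetime) -> list[Dispute]:
    """Keep only disputes created strictly after the cursor (assumed filter)."""
    return [d for d in results if d.created_at > log_cursor]


log_cursor = datetime(2025, 1, 11, 0, 30, tzinfo=timezone.utc)
results = [
    # Already emitted by an earlier window; returned again by the wider search.
    Dispute("dispute_0", datetime(2025, 1, 10, 9, 0, tzinfo=timezone.utc)),
    # Created after the cursor, so it should be emitted now.
    Dispute("dispute_1", datetime(2025, 1, 11, 0, 50, tzinfo=timezone.utc)),
]

fresh = new_disputes(results, log_cursor)
```

Under these assumptions, only `dispute_1` survives the filter, so the extra day of search coverage costs duplicate reads but not duplicate documents.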