
Adding yelp_clog S3LogsReader #949

Merged: 10 commits merged into master from yaro/yelp_clog_s3reader, Apr 29, 2024
Conversation

@yaroliakh (Contributor) commented Apr 11, 2024:

Enabling S3LogsReader from yelp_clog as an alternative method for reading tron PaaSTA logs.
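For context, a minimal sketch of the new read path (the get_log_reader call mirrors the snippet reviewed below; the import path, ecosystem, stream name, and dates are assumptions for illustration):

from datetime import date

# import path is an assumption about yelp_clog's layout
from clog.readers import S3LogsReader

reader = S3LogsReader("norcal-devc").get_log_reader(
    log_name="stream_paasta_app_output",  # hypothetical stream name
    min_date=date(2024, 4, 1),
    max_date=date(2024, 4, 2),
)
for line in reader:
    print(line)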

@yaroliakh changed the title from "Yaro/yelp clog s3reader" to "Adding yelp_clog S3LogsReader" on Apr 12, 2024
tron/utils/scribereader.py (outdated review threads, resolved)
log.warning("Unable to read location mapping files from disk, not returning scribereader host/port")
return None

def get_scribereader_host_and_port(ecosystem, superregion, region: str) -> Optional[Tuple[str, int]]:
# NOTE: Passing in an ecosystem of prod is not supported by scribereader
Member:

just checking: is this comment still valid for the new system?

Contributor Author:

still valid: when the scribereader tailer is used, it needs to get the scribe service host and port, and that lookup raises a KeyError when called with the "prod" ecosystem argument.
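A hedged illustration of that failure mode (the mapping and names here are hypothetical, not tron's actual code):

# hypothetical host/port mapping with no "prod" entry, mirroring the behavior described above
HOST_PORT_BY_ECOSYSTEM = {"devc": ("scribe-devc.example.com", 1463)}

def lookup_host_and_port(ecosystem: str):
    # raises KeyError when called with ecosystem="prod"
    return HOST_PORT_BY_ECOSYSTEM[ecosystem]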

Comment on lines 229 to 231
# for logs that use Kafka topics with multiple partitions underneath or retrieved by S3LogsReader,
# data ordering is not guaranteed - so we'll sort based on the log timestamp set by the producer.
lines = paasta_logs.sort_log_lines()
Member:

one day we won't need to sort things :P

that said: does this mean that scribereader/paasta logs/etc will also need to sort?

Contributor Author:

yes, scribereader still doesn't provide sorting: both stream_reader and stream_tailer can return unordered logs when the underlying Kafka topic has more than one partition. paasta logs also does its own sorting, and it's actually even more similar now with the PaaSTALogs class that will be used to fetch and filter logs.
Thus sorting is still needed on the client side (tron, paasta, yelp_clog, etc.) until we figure out something on the backend.
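A minimal sketch of the client-side sort being discussed (the JSON field names are assumptions; tron's actual line format may differ):

import json

def sort_log_lines(raw_lines):
    # parse each JSON log line, then sort once by the producer-set timestamp
    parsed = [json.loads(line) for line in raw_lines]
    parsed.sort(key=lambda entry: entry["timestamp"])
    return [entry["message"] for entry in parsed]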

Member:

ah gotcha - i always assumed that paasta logs was just dumping what scribereader returned - TIL :p


log.debug("Using S3LogsReader to retrieve logs")
s3_reader = S3LogsReader(ecosystem).get_log_reader(log_name=stream_name, min_date=start_date, max_date=end_date)
paasta_logs.iterate_logs(s3_reader, max_lines)
Member:

i wonder if we should rename iterate_logs as we're really fetching them - not returning an iterator

Member:

also: thoughts on having this function return the log line? atm, it reads a bit weirdly since you'd expect a return value - but we're actually having the class store the log lines instead

Contributor Author (Apr 16, 2024):

agree, it's not ideal, but the primary reason for using the class store is to avoid repeated sorting of lines when the iterator is called multiple times for different readers, as in the case of the scribereader reader & tailer. With the class store we can sort once after all the logs are fetched. Even though this is not an issue at all with the current max_lines being pretty low, I was wondering if we should increase it to reduce the not-so-rare occurrences of "This output is truncated" for certain jobs?

Member:

increasing max_lines should be fine - it's more of a protection against jobs that spew gigabytes of logs and run multiple times :)

i'm hoping that at some point in the future we try to re-architect things a bit here so that we can store these logs in a more efficient way for tron to query them, but that's for future us!

Member:

we're also currently only sorting once in the pre-PR code though, right?

Contributor Author (Apr 16, 2024):

right, nothing changes in this regard. With PaastaLogs I'm just trying to avoid extra code duplication with the addition of the third reader. I think we can simplify it once again with the removal of the scribe readers.

Contributor Author:

so to rephrase:

  • one point for using the class store is to sort only once, when all the logs have been fetched
  • secondly, to mask the details of paasta logs sorting inside the class itself (as we should care only about lines), rather than returning output as a list of tuples and doing the sorting and extraction of lines outside it (see the sketch below)
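A hedged sketch of that "class store" shape (the method names follow the snippets above, but the bodies and the assumption that readers yield (timestamp, line) tuples are illustrative, not tron's actual implementation):

class PaastaLogs:
    def __init__(self):
        # the class store: (timestamp, line) tuples accumulated across readers
        self.lines = []

    def iterate_logs(self, reader, max_lines):
        # drain any reader (scribereader stream/tailer or S3LogsReader)
        for count, entry in enumerate(reader):
            if count >= max_lines:
                break
            self.lines.append(entry)

    def sort_log_lines(self):
        # sort once, after all readers have been drained, and return bare lines
        self.lines.sort(key=lambda pair: pair[0])
        return [line for _, line in self.lines]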

tron/utils/scribereader.py (outdated review thread, resolved)
s3_reader = S3LogsReader(ecosystem).get_log_reader(log_name=stream_name, min_date=start_date, max_date=end_date)
paasta_logs.iterate_logs(s3_reader, max_lines)
else:
end_date = max_date.date() if max_date else None # type: ignore
Member:

just curious: why are we telling mypy to ignore this line?

Contributor Author:

updated end_date; before the change, mypy was complaining about a type mismatch on this line.
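For illustration, one hedged way to keep mypy happy here without the ignore is to make the Optional types explicit (a sketch, not the actual fix in this PR):

from datetime import date, datetime
from typing import Optional

def to_end_date(max_date: Optional[datetime]) -> Optional[date]:
    # mypy narrows max_date to datetime inside the conditional, so no ignore is needed
    return max_date.date() if max_date is not None else None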

tron/utils/scribereader.py (outdated review threads, resolved)
requirements.txt (outdated)
Member:

just double-checking: does check-requirements pass locally?

because of our hacky requirements setup here, it might also be worth adding the extra yelp reqs to requirements/requirements-minimal to make sure everything is pinned :)

Contributor Author:

hm, check-requirements works locally.

"adding the extra yelp reqs to requirements/requirements-minimal"

wouldn't it break the GitHub CI, as yelp internal packages won't be available?

Member:

ah, sorry - i was thinking faster than i was typing: i meant "it might also be worth temporarily adding the extra yelp reqs to requirements/requirements-minimal to make sure everything is pinned :)" - temporarily adding them would let you run check-requirements and make sure everything is pinned correctly :)

Contributor Author:

I see now. yes, there are some errors, i'll need to update it

Contributor Author:

these have been updated in yelp_package/extra_requirements_yelp.txt

@nemacysts (Member):

(also tagging some folks working on tron stuff this quarter to make sure they're aware of the logging changes)

@nemacysts (Member) left a review comment:

lgtm - i can ship once you figure out the local test issues

(i'm not super enamoured with storing the logs in the class, but my objections here are mostly 'cause it doesn't read naturally to my eyes - which isn't really a reason to block :p)

@yaroliakh force-pushed the yaro/yelp_clog_s3reader branch from 63ab755 to e89f71b on April 18, 2024 12:08
@yaroliakh (Contributor Author):

@nemacysts I was struggling a bit with mypy import errors, so I decided to update mypy since newer versions seem to support smarter ignore conditions in code, but I eventually ended up just adding global ignore rules to mypy.ini.
Hopefully the new mypy version won't be an issue; otherwise I can revert it if needed.
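For reference, a global per-module ignore in mypy.ini looks roughly like this (the module pattern below is an assumption for illustration, not the exact rule added in this PR):

[mypy-clog.*]
ignore_missing_imports = True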

@yaroliakh requested a review from nemacysts on April 19, 2024 08:05
@@ -7,3 +7,6 @@ pylint
pytest
pytest-asyncio
requirements-tools
types-PyYAML
types-requests<2.31.0.7
Member:

just curious, why the upper-bound?

pytest.skip("yelp logs readers not available, skipping tests", allow_module_level=True)


def static_conf_patch(args={}):
Member:

generally, mutable default arguments are something to be avoided - i think maybe something like:

def static_conf_patch(args=None):
    args = args if args is not None else {}
    return lambda arg, namespace, default=None: args.get(arg)

Member:

(might also be worth a comment here since i'm not 100% sure why this is necessary atm)

Contributor Author:

Updated, thanks. I'm primarily adding this because, without explicit patching of specific staticconf values, using return_value= in unittest.mock.patch causes side effects when staticconf.read is called more than once, as happens with the addition of the new reader functionality.
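A hedged usage sketch of that helper (the patch target path and the config key are assumptions for illustration):

from unittest import mock

# each key resolves from an explicit dict, so repeated staticconf.read calls
# don't all collapse onto a single return_value
with mock.patch(
    "staticconf.read",
    side_effect=static_conf_patch({"logging.max_lines": 1000}),  # hypothetical key
):
    ...  # exercise the reader code under test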

Comment on lines 21 to 22
tron.utils.scribereader.scribereader_available = False
tron.utils.scribereader.s3reader_available = False
Member:

i think setting these with a mock might be a tad cleaner since that would take care of setting things back without needing a finally block
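A sketch of that suggestion (the attribute names come from the snippet above; the surrounding test body is assumed):

from unittest import mock

import tron.utils.scribereader

# the patches revert automatically when the with-block exits,
# so no try/finally cleanup is needed
with mock.patch.object(tron.utils.scribereader, "scribereader_available", False), \
     mock.patch.object(tron.utils.scribereader, "s3reader_available", False):
    ...  # run the code under test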

@yaroliakh merged commit 82419de into master on Apr 29, 2024
3 checks passed