Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster OreSatConfig() creation #20

Merged
merged 5 commits into from
Feb 26, 2024
Merged

Faster OreSatConfig() creation #20

merged 5 commits into from
Feb 26, 2024

Conversation

ThirteenFish
Copy link
Contributor

This contains the changes so far discussed in #14. The major features:

  • Replaces dataclasses_json with dacite. Now that this project no longer uses JSON the dataclasses_json package was only being used for dict -> dataclass conversion. The dacite package does this much faster.
  • Memoizes CardConfig.from_yaml(). Trades a bit of space for a lot of time.
  • Ensures that pyyaml uses the libyaml C bindings. This is unfortunately a bit of a manual process and until pyyaml is re-installed correctly users will see an ImportError directing them to the README for further instructions. I tried to make this a more automatic process through pip installs but my brain boggles every time I try to do complex python packaging stuff.

Profiling OreSatConfigs() shows that dataclasses_json's from_dict method
takes a significant amount of time, about 1/3rd of the total time spent.
Given that a recent timing of OreSatConfigs() on the c3 takes ~75
seconds to construct that's about 25 seconds just converting dicts to
dataclasses.

A different library, dacite, that also provides a from_dict for
dataclasses is significantly faster. After swapping libraries creating
OreSatConfigs() takes ~50 seconds, so that looks like almost all the
time spent has been removed. dacite also offers stricter type checking
as an added bonus.
A recent profile shows that CardConfig.from_yaml() takes up about 80% of
the total time spent in OreSatConfig(). Given that it takes ~50 seconds
to construct on the C3, that's ~40 seconds creating CardConfigs.

Profiling shows that CardConfig.from_yaml() is called 35 times, but
since there's only 16 yaml files and each CardConfig is supposed to
mirror each yaml file, we only need to create at most 16 unique
CardConfigs. Most of the duplicates come from _load_configs() loading
the fw_ and sw_ common.yaml files repeatedly. We could try to carefully
construct only exactly the CardConfigs needed, but there's an easy trick
to get most of the way there: memoization (not to be confused with
memorization).

Memoization is the technique of caching the output of expensive
functions and then looking up the result later when the function is
called again instead of the expensive calculation. Python makes
memoization very easy with the functools.cache decorator.

Using @cache on CardConfig.from_yaml() shows that it now only gets
called 11 times, bringing OreSatConfig() construction time down to ~9
seconds.
For performance we want to only use the C bindings version of pyyaml
(and also the docs mention that there are some unspecified minor
differences in parsing). On x86, with the requisite library installed,
this works by default, but on ARM systems (like the C3 Octavo) the
wheels provided by PyPI aren't built with the bindings. This means we
have to build them from source, which thankfully pip makes not onerously
difficult, but some amount of deviation from the standard install is
required.

This makes `OreSatConfigs()` about 2/3 faster, in my testing I saw a
reduction from 9s to 3s.

This also updates the Github actions to stop deprecation warnings and
uses the cache feature for pip to make it go a bit faster.
This is a handful of minor fixups to pyproject.toml. Since we we have
only been testing with Python 3.9 the minimum supported version has been
de facto that due to later language features sneaking in. If there are
people still on 3.7 I can work with them to set up pyenv to manage a
local install of python 3.9.

This also completes the purge of dataclasses-json

Finally it tells pylama to ignore my personal virtualenv tooling, I
didn't immediately see an easy way to do that only locally.
@ThirteenFish ThirteenFish requested a review from ryanpdx February 26, 2024 02:54
Copy link
Member

@ryanpdx ryanpdx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@ryanpdx ryanpdx merged commit e250690 into master Feb 26, 2024
1 check passed
@ryanpdx ryanpdx deleted the fast-oresatconfig branch February 26, 2024 03:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants