dws: rabbit config file support #234

Merged (8 commits), Nov 5, 2024

@jameshcorbett jameshcorbett commented Nov 5, 2024

Moves a bunch of coral2_dws.py command-line arguments to a [rabbit] table in the Flux config.

See also flux-framework/flux-docs#286 for a description of what the config file should look like.

Fixes #220.

@jameshcorbett jameshcorbett linked an issue Nov 5, 2024 that may be closed by this pull request
Problem: a number of rabbit configuration options, such as the
maximum size of file system users can request, are set by
command-line options to the coral2-dws service.

Move the file system size options to a config file and make
coral2-dws read from it. Drop the command-line options.
Problem: a number of rabbit configuration options, such as the
number of nnfdatamovement resources to save and whether or not
to restrict the creation of persistent file systems to the
instance owner, are set by command-line options to the coral2-dws
service.

Move those parameters to a config file and drop the command-line
options.
Problem: the option to disable the draining of compute nodes that
lose connection with their local rabbit is set by a command-line
option to the coral2-dws service. It should be a config file option.

Move the draining option to a config file and make coral2-dws read
from it. Drop the command-line option.
Problem: the coral2_dws script has a command-line option to set the
path of the file from which it reads JGF for the rabbit resources
it will operate on. However, that is unnecessary and unhelpful because
it could instead fetch the `resource.R` KVS key, which is a more
reliable option.

Drop the command-line argument and make the script read from the
KVS.
Problem: a function receives an argparse Namespace object with a
number of variables set on it, but only uses two of them.

Pass the two variables the function needs rather than the whole
namespace object.
Problem: flux-coral2-dws takes an optional flag to set the timeout
after which it kills jobs whose workflows are stuck in
TransientCondition. However, it should be a config file option.

Make coral2-dws read the parameter from the `rabbit` table in the
config. Drop the command-line option.
Problem: flux-coral2-dws takes an optional flag to set the path
to the kubeconfig file for it to use. However, it should be a config
file option.

Make coral2-dws read the parameter from the `rabbit` table in the
config. Drop the command-line option.
Problem: there are no checks to ensure that the rabbit config table
is valid.

Add some simple validation.
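
For readers who have not opened the diff, the kind of "simple validation" the last commit describes might look roughly like the sketch below. It is not the code from this PR: the key names, types, and defaults in `RABBIT_SCHEMA` and the function name `validate_rabbit_table` are assumptions made for illustration, and the real layout of the `[rabbit]` table is the one described in flux-framework/flux-docs#286.

```python
# Sketch only: the key names, types, and defaults below are assumptions made
# for illustration, not the actual schema used by coral2_dws.py.
RABBIT_SCHEMA = {
    "kubeconfig": (str, None),
    "tc_timeout": (int, 10),
    "drain_compute_nodes": (bool, True),
}


def validate_rabbit_table(config):
    """Return the 'rabbit' table with defaults filled in, or raise ValueError."""
    table = config.get("rabbit", {})
    if not isinstance(table, dict):
        raise ValueError("rabbit config must be a table")
    result = {}
    for key, value in table.items():
        if key not in RABBIT_SCHEMA:
            raise ValueError(f"rabbit config: unknown key {key!r}")
        expected_type, _ = RABBIT_SCHEMA[key]
        # bool is a subclass of int, so reject booleans where an int is expected
        if expected_type is int and isinstance(value, bool):
            raise ValueError(f"rabbit config: {key!r} must be an integer")
        if not isinstance(value, expected_type):
            raise ValueError(
                f"rabbit config: {key!r} must be of type {expected_type.__name__}"
            )
        result[key] = value
    # Fill in defaults for any keys the config file left out
    for key, (_, default) in RABBIT_SCHEMA.items():
        result.setdefault(key, default)
    return result
```

Here `config` stands in for the already-parsed Flux config as a plain dict; how the service fetches it from the Flux handle is left out. Rejecting unknown keys is a design choice in this sketch that would catch typos in the config file early.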

@cmoussa1 cmoussa1 left a comment

LGTM. Looks like a really neat improvement, moving these configuration options to a TOML file 👍

@jameshcorbett

Thanks! Setting MWP.

@mergify mergify bot merged commit c3131b6 into flux-framework:master Nov 5, 2024
8 checks passed
@jameshcorbett jameshcorbett deleted the rabbit-configfile branch November 7, 2024 17:23
Linked issue: Add a rabbit config table