How can admins enable parallel processing for qiime2 tools #58

Open
bernt-matthias opened this issue Nov 19, 2024 · 7 comments

Hi @ebolyen and @colinvwood

In #47, the parameters setting the number of cores were removed from the XML (which was the right thing to do).

I'm wondering how the number of cores can now be set (by admins). Typically, Galaxy tools use the GALAXY_SLOTS environment variable (e.g. here) and pass it via a CLI parameter. Alternatively, qiime2 tools could of course access the variable directly.
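For illustration, a minimal sketch of the "access the variable directly" alternative; the function name and fallback value are just illustrative, not part of any existing qiime2/q2galaxy API:

```python
import os


def resolve_n_threads(requested=None):
    """Pick the thread count to use inside a Galaxy job.

    GALAXY_SLOTS is set in the job environment by the Galaxy job runner with
    the number of cores allocated to the job; fall back to 1 when it is unset
    (e.g. when running outside Galaxy).
    """
    if requested is not None:
        return int(requested)
    return int(os.environ.get('GALAXY_SLOTS', '1'))


if __name__ == '__main__':
    print(resolve_n_threads())
```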

ebolyen commented Nov 19, 2024

It probably makes the most sense for q2galaxy to access that variable directly; that would be much simpler than templating anything specifically, and since the parameters are marked with their own special type, we can handle this robustly.

I think one open question is the exact semantics of GALAXY_SLOTS: does it mean available threads/cores, or something more abstract?

ebolyen moved this from Needs Triage to Awaiting Info in QIIME 2 - Triage 🚑 on Nov 19, 2024
ebolyen commented Nov 19, 2024

Found the docs here:
https://planemo.readthedocs.io/en/master/writing_advanced.html#developing-for-clusters-galaxy-slots-galaxy-memory-mb-and-galaxy-memory-mb-per-slot

Looks straightforward. I think we can update action_kwargs here to pass $GALAXY_SLOTS as the argument to a parameter whenever it has the Thread primitive type.
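As a rough sketch of that idea (hypothetical helper, not the actual q2galaxy code): when templating a parameter whose type is the Thread primitive, emit Galaxy's runtime substitution instead of a fixed default, so the slot count configured by the admin flows through automatically:

```python
def template_default(primitive_name, declared_default):
    """Value to template into the generated tool for one parameter.

    ``${GALAXY_SLOTS:-1}`` is the standard Galaxy expansion that resolves to
    the number of cores the job destination allocated, defaulting to 1.
    """
    # Hedging on the exact name of the primitive discussed above.
    if primitive_name in ('Thread', 'Threads'):
        return r'${GALAXY_SLOTS:-1}'
    return declared_default


if __name__ == '__main__':
    print(template_default('Threads', 1))  # -> ${GALAXY_SLOTS:-1}
    print(template_default('Int', 10))     # -> 10
```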

I think we would not do anything for our Job type, as that is better handled by Galaxy running the same action over a collection.

ebolyen moved this from Awaiting Info to Needs Prioritization in QIIME 2 - Triage 🚑 on Nov 19, 2024
ebolyen commented Nov 19, 2024

Also, out of curiosity, is there any mechanism to submit a new job from inside a Galaxy job and retain some kind of reference/future to it? @Oddant1 is refactoring some stuff with our parallel processing and there's an outside chance we could make this happen if such an API existed and server admins were amenable to the concept of it.

bernt-matthias commented Nov 20, 2024

This sounds good to me. I think it would also be good to add resource requirements to the tools, since otherwise admins (or dynamic job rules) have no means to judge which tools support parallelism:

<requirements>
   ...
   <resource type="min_cores">X</resource>
   <resource type="max_cores">Y</resource>
</requirements>

For completeness, there are more resource types; see here.

Edit: For instance, you could add

  • <resource type="max_cores">1</resource> for tools that do not support parallelism
  • <resource type="min_cores">1</resource> for tools that do support parallelism (plus <resource type="max_cores">X</resource> if more than X cores would be inefficient).

bernt-matthias commented

Also, out of curiosity, is there any mechanism to submit a new job from inside a Galaxy job and retain some kind of reference/future to it? @Oddant1 is refactoring some stuff with our parallel processing and there's an outside chance we could make this happen if such an API existed and server admins were amenable to the concept of it.

There is an API that allows executing Galaxy tools that are installed on a Galaxy instance. But doing this from inside tools seems like a bad idea, because it will be difficult for users to trace what has been executed. You could only run existing Galaxy tools anyway, and I think this would be much better implemented as a workflow. Also I do not know if we can assume that the Galaxy instance can be reached from the executing host.
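For reference, this kind of remote execution is exposed e.g. through BioBlend; a minimal sketch with placeholder URL, API key, and IDs (and it requires exactly the network reachability questioned above):

```python
from bioblend.galaxy import GalaxyInstance

# Placeholder credentials/IDs; a real caller would need an API key valid on
# the target instance plus the encoded history/dataset IDs.
gi = GalaxyInstance(url='https://galaxy.example.org', key='API_KEY')

# Submit an installed tool as a new job in an existing history.
result = gi.tools.run_tool(
    history_id='HISTORY_ID',
    tool_id='SOME_TOOL_ID',
    tool_inputs={'input1': {'src': 'hda', 'id': 'DATASET_ID'}},
)

# The response lists the created jobs/outputs, which could serve as the
# "reference/future" to poll later.
print(result['jobs'][0]['id'])
```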

One also needs to keep in mind the diversity of Galaxy job runners (local, SLURM, AWS, Pulsar, ...), so I do not think there can be a single mechanism; intuitively I would say that subprocesses / threads are the way to go. Might it be an option to tweak the granularity of the tools in case you need parallelism beyond a single compute node, e.g. by splitting inputs / making the jobs that are subprocesses separate tools?

ebolyen commented Nov 20, 2024

Also I do not know if we can assume that the Galaxy instance can be reached from the executing host.

That makes sense, and I think we could test for it and do something else. But this felt like a long shot either way.

One also needs to keep in mind the diversity of Galaxy job runners (local, SLURM, AWS, Pulsar, ...), so I do not think there can be a single mechanism; intuitively I would say that subprocesses / threads are the way to go. Might it be an option to tweak the granularity of the tools in case you need parallelism beyond a single compute node, e.g. by splitting inputs / making the jobs that are subprocesses separate tools?

Yep, that all makes sense. I think this just leaves us where we were anticipating: for cross-node parallelism in Galaxy, the answer is to partition your data into a Collection (which maps to a Galaxy collection) and then just do things normally. Since most of our metagenomic tools are written in this split-apply-combine style inside QIIME 2 pipeline actions, the inner methods already exist, so users should just get in the habit of using them directly instead of the simpler one-shot pipeline actions.

ebolyen commented Nov 20, 2024

Edit: For instance, you could add

  • <resource type="max_cores">1</resource> for tools that do not support parallelism
  • <resource type="min_cores">1</resource> for tools that do support parallelism (plus <resource type="max_cores">X</resource> if more than X cores would be inefficient).

Perfect, that should map to our Thread % Range(...) predicate, so we can populate these for all tools and provide min_cores+max_cores whenever an action indicates such on a Thread type.
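As a small sketch of that mapping (illustrative only; the real implementation would introspect the action's type expression), a Thread % Range(1, X) parameter could be rendered into the <requirements> block suggested above:

```python
def resource_requirements(min_cores=None, max_cores=None):
    """Render <resource> elements for a generated tool's <requirements> block."""
    parts = []
    if min_cores is not None:
        parts.append(f'<resource type="min_cores">{min_cores}</resource>')
    if max_cores is not None:
        parts.append(f'<resource type="max_cores">{max_cores}</resource>')
    return '\n'.join(parts)


if __name__ == '__main__':
    # An action declaring something like Thread % Range(1, 8):
    print(resource_requirements(min_cores=1, max_cores=8))
    # An action with no Thread parameter at all:
    print(resource_requirements(max_cores=1))
```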
