new subworkflow: distributed computing GRIDSS subworkflow #4498

Open
4 tasks done
johnoooh opened this issue Nov 29, 2023 · 4 comments
Labels
awaiting-feedback will be closed after 30 days new subworkflow

Comments

@johnoooh
Contributor

Is there an existing subworkflow for this?

  • I have searched for the existing subworkflow

Is there an open PR for this?

  • I have searched for existing PRs

Is there an open issue for this?

  • I have searched for existing issues

Are you going to work on this?

  • If I'm planning to work on this subworkflow, I added myself to the Assignees to facilitate tracking who is working on the subworkflow
@johnoooh
Contributor Author

There is currently a GRIDSS module in nf-core/modules, but it can be slow on very complex tumor-normal pairs. In an HPC environment, distributed computing can speed up the calling. GRIDSS has commands that support distributed execution, but they need to be split into separate modules. Unfortunately, these distributed jobs are meant to run in the same working directory, which Nextflow can't really do because each task runs in its own work directory. The individual steps create many intermediate files, which must be passed from one process to the next and then renamed to the names GRIDSS expects. I have done it this way and it works, but I am of course open to suggestions on better ways to do this.
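To make the staging-and-renaming idea concrete, here is a minimal Python sketch of the pattern, not the actual subworkflow. It assumes each step runs in its own isolated directory and that the downstream step only finds its inputs under fixed relative paths. The function name `stage_intermediates`, the logical key `preprocess_out`, and every file name below are hypothetical placeholders, not real GRIDSS file names.

```python
import shutil
import tempfile
from pathlib import Path


def stage_intermediates(produced, task_dir, expected_names):
    """Copy upstream outputs into task_dir under the names the tool expects.

    produced:       logical key -> path emitted by an upstream task
    expected_names: logical key -> relative path the downstream tool looks for
    Both mappings are illustrative assumptions, not a GRIDSS convention.
    """
    task_dir = Path(task_dir)
    staged = {}
    for key, rel_name in expected_names.items():
        if key not in produced:
            raise KeyError(f"upstream task did not emit '{key}'")
        target = task_dir / rel_name
        # Recreate any nested working sub-directory the tool expects.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(produced[key], target)
        staged[key] = target
    return staged


def demo():
    with tempfile.TemporaryDirectory() as root:
        root = Path(root)
        # The upstream task wrote its output in its own isolated directory.
        upstream = root / "work_a"
        upstream.mkdir()
        emitted = upstream / "sample_preprocess.out"  # hypothetical name
        emitted.write_text("intermediate")

        # The downstream task gets a fresh directory and needs the file
        # under a different, tool-specific name (hypothetical here).
        downstream = root / "work_b"
        staged = stage_intermediates(
            {"preprocess_out": emitted},
            downstream,
            {"preprocess_out": "sample.working/sample.intermediate"},
        )
        path = staged["preprocess_out"]
        return path.relative_to(downstream).as_posix(), path.read_text()


if __name__ == "__main__":
    print(demo())
```

In Nextflow itself, the `stageAs` option of the `path` input qualifier can do comparable renaming when inputs are staged, which may reduce the amount of manual renaming inside the process script.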

@famosab
Contributor

famosab commented Mar 12, 2025

@johnoooh Do you still plan to work on this? If you need input, hopping over to Slack might also be worth a try! :)

@johnoooh
Contributor Author

oncoanalyser now performs these steps and has integrated them into a subworkflow, but the modules are not in the nf-core/modules repo; they are local modules for the pipeline. At a minimum, they need tests before they can be integrated into the nf-core repo. I'll look into doing this for the hackathon. Is this something the oncoanalyser team would like? @scwatts, what do you think? I know that keeping it as a local module rather than an nf-core module gives you a bit more flexibility.

https://github.com/nf-core/oncoanalyser/blob/master/subworkflows/local/gridss_svprep_calling/main.nf

@scwatts
Contributor

scwatts commented Mar 13, 2025

Hi @johnoooh, we're no longer using GRIDSS in oncoanalyser after replacing it with ESVEE - let me know if there is still interest and I can open a PR in nf-core/modules that includes the GRIDSS processes I wrote to kick off this work!


3 participants