Patch incomplete logic in complex reformatting script #776

Merged (1 commit) on Feb 11, 2025

RCollins13
Contributor

A conditional in extract_bp_list_v4(), within reformat_CPX_bed_and_generate_script.py, does not cover all possible cases.

I encountered this bug during processing of a ~60k WGS callset with GATK-SV v1.0.1. This error occurred exactly once across all 24 chromosomes, so I surmise it is a pretty rare edge case.

Relevant variant information as follows:
Candidate complex insertion SV
BED coordinates: chr5:94362620-94362621
SOURCE field: INV_chr5:94362525-94362620

To me, this looks like either a small inversion (about 95 bp, going by the SOURCE coordinates) that GATK-SV has represented or resolved oddly, or a small inverted insertion whose insertion point coincides with the right breakpoint of the inverted source segment.

Either way, this exposes a gap in the logic on lines 193-199 of reformat_CPX_bed_and_generate_script.py. That block compares the source and sink coordinates using only strict inequalities (greater than or less than), so it does not cover this case, where the right breakpoint of the SOURCE interval is equal to the left breakpoint of the sink.

I have patched this by changing the final elif condition in extract_bp_list_v4() to use a greater-than-or-equal-to comparison, which covers this case.
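For readers without the script open, the gap can be illustrated with a minimal sketch. This is not the actual body of extract_bp_list_v4(): the function names, argument names, and return labels below are hypothetical, and only the coordinates come from the variant described above. It assumes the comparison is between the sink's left breakpoint and the two ends of the SOURCE interval.

```python
# Hypothetical sketch of the branch structure; NOT the real GATK-SV code.
# Function names, argument names, and return labels are assumptions.

def sink_side_strict(sink_start, source_start, source_end):
    """Pre-patch style: strict inequalities only."""
    if sink_start < source_start:
        return "sink_left_of_source"
    elif sink_start > source_end:
        return "sink_right_of_source"
    # sink_start == source_end matches neither branch above
    return None


def sink_side_patched(sink_start, source_start, source_end):
    """Patched style: the final elif uses >= instead of >."""
    if sink_start < source_start:
        return "sink_left_of_source"
    elif sink_start >= source_end:
        return "sink_right_of_source"
    return None


# Coordinates from the variant reported in this PR.
SINK_START = 94362620                      # BED: chr5:94362620-94362621
SOURCE_START, SOURCE_END = 94362525, 94362620  # SOURCE: INV_chr5:94362525-94362620

print(sink_side_strict(SINK_START, SOURCE_START, SOURCE_END))   # None
print(sink_side_patched(SINK_START, SOURCE_START, SOURCE_END))  # sink_right_of_source
```

In this sketch a sink that falls strictly inside the source interval still matches neither branch; whether the real function needs to handle that case is outside the scope of this patch.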

@RCollins13
Contributor Author

@epiercehoffman this is the simple bugfix I mentioned over Talkowski slack on Feb 11. Tagging you here for your review. Thanks in advance!

Collaborator

@epiercehoffman left a comment

Thanks for this fix!

Over time we've encountered a few similar edge cases in this script that have been stuck in the backlog due to dreams of porting this module to GATK. So it's great to get one of them patched in the meantime, and a good reminder that a few more are still in the queue.

@epiercehoffman merged commit 22bf77e into broadinstitute:main on Feb 11, 2025
4 checks passed