Patch incomplete logic in complex reformatting script #776
There is a logical statement in `extract_bp_list_v4()` within `reformat_CPX_bed_and_generate_script.py` that does not completely cover all possible cases.
I encountered this bug while processing a ~60k WGS callset with GATK-SV v1.0.1. The error occurred exactly once across all 24 chromosomes, so I surmise it is a pretty rare edge case.
Relevant variant information as follows:
Candidate complex insertion SV
BED coordinates: `chr5:94362620-94362621`
SOURCE field: `INV_chr5:94362525-94362620`
To me, this appears to be either a small (91bp) inversion represented/resolved oddly by GATK-SV, or is some kind of small inverted insertion that happens to be inserted at the same position as the right breakpoint of the inverted source segment.
Either way, this exposes a gap in the logic on lines 193-199 of `reformat_CPX_bed_and_generate_script.py`, which compares the coordinates of the source and sink using only strictly-greater-than or strictly-less-than inequalities. That does not cover this case, where the right breakpoint of the SOURCE interval is equal to the left breakpoint of the sink.
I have patched this logic by converting the final `elif` statement within `extract_bp_list_v4()` to use a greater-than-or-equal-to inequality, which covers this case.
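
To illustrate the gap, here is a minimal sketch of the comparison pattern described above. It is not the actual code from `reformat_CPX_bed_and_generate_script.py`: the function names, return labels, and branch order are hypothetical and only assume that the source and sink are compared with two strict inequalities, the second of which becomes `>=` in the patch.

```python
# Hypothetical sketch; names and branch order are assumptions, not the
# script's real implementation.

def source_side_strict(sink_start, sink_end, src_start, src_end):
    """Strict inequalities only: misses the case where the intervals touch."""
    if sink_end < src_start:
        return "source_right_of_sink"
    elif sink_start > src_end:
        return "source_left_of_sink"
    return None  # neither branch matched, so nothing is assigned downstream


def source_side_patched(sink_start, sink_end, src_start, src_end):
    """Same comparison, with the final elif relaxed to >=."""
    if sink_end < src_start:
        return "source_right_of_sink"
    elif sink_start >= src_end:
        return "source_left_of_sink"
    return None


# Coordinates from the variant reported above:
# sink   = chr5:94362620-94362621
# source = INV_chr5:94362525-94362620
sink = (94362620, 94362621)
source = (94362525, 94362620)

strict_result = source_side_strict(*sink, *source)
patched_result = source_side_patched(*sink, *source)
print(strict_result, patched_result)
```

With these coordinates the source end equals the sink start, so the strict version matches neither branch, while the relaxed version treats the source as lying to the left of the sink.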