dencode-partition fails with AssertionError #144
Comments
This looks like empty partitions, similar to the issue we had on explode? |
I can confirm that in |
Can you run that again with Can you have a look at the ICF metadata file and see if there are any partitions with |
What version of bio2zarr have you here? Line 605 is nowhere near |
Can you view the icf metadata please, and show the last few partitions. Must be something odd to do with tabix indexing |
This was on |
Also, it looks like you're pulling the bio2zarr code from a local directory; it would be better to install it into the environment. There are regular releases for just this. |
Very puzzling... Can you give me the output of these commands please:
and
(Regretting not making the chunk indexes a simple JSON file now!) |
Ahhhh - sorry, I should have zoomed in! 🤦 |
OK, looks like it's a problem with the indexing code. We should have |
I'm going to see if I can replicate on the chr2 1000G data - we'll see how long it all takes when there's only 40 processes. |
I can't reproduce this on 1000G - seems to work fine with maximal partitioning. There's quite a lot fewer variants though. Is it just this particular partition that the error occurs on or are there others? |
There were a few hundred that failed with this error - digging into this today. |
Trying to understand what is going on here: |
I've also checked and it is 550 failing partitions spread non-contiguously from 1708-5985. |
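For anyone triaging a similar spread of failures, it can help to collapse the failing partition indexes into contiguous runs, to see whether they cluster around particular input files. A minimal sketch in plain Python; `contiguous_runs` is a hypothetical helper, not part of bio2zarr, and the indexes in the usage note are made up for illustration:

```python
def contiguous_runs(indices):
    """Collapse integer indexes into inclusive (first, last) runs."""
    runs = []
    for i in sorted(set(indices)):
        if runs and i == runs[-1][1] + 1:
            # Extends the current run by one.
            runs[-1] = (runs[-1][0], i)
        else:
            # Gap found: start a new run.
            runs.append((i, i))
    return runs
```

Calling `contiguous_runs([3, 4, 5, 9, 11, 12])` groups the indexes into `(3, 5)`, `(9, 9)` and `(11, 12)`, which makes gaps between failing partitions easy to spot.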
Ah-ha, wonder if this has something to do with it. Running |
Ahhhh |
We shouldn't have overlaps here though, right? Can you give some details? |
A bunch of weird stuff here... are you sure the files are all part of the same set? |
They are all in the same folder and have the same naming convention. The 58219159 file contains variants that start at that position. |
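One way to check the overlap question directly is to take the first and last variant position of each file and look for intervals that intersect. This is only a sketch under assumptions: `Region` and `find_overlaps` are hypothetical names, the positions would have to be read from the VCFs themselves, and this is not how bio2zarr does it.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str   # hypothetical label, e.g. a VCF file name
    start: int  # position of the first variant (inclusive)
    end: int    # position of the last variant (inclusive)


def find_overlaps(regions):
    """Return (earlier, later) name pairs where a region starts before
    the widest region seen so far has ended."""
    ordered = sorted(regions, key=lambda r: (r.start, r.end))
    overlaps = []
    widest = None  # region with the largest end seen so far
    for cur in ordered:
        if widest is not None and cur.start <= widest.end:
            overlaps.append((widest.name, cur.name))
        if widest is None or cur.end > widest.end:
            widest = cur
    return overlaps
```

An empty result means the files cover disjoint position ranges on that contig. Any pair returned is worth inspecting by hand.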
So... that one VCF file, the |
So this is the only index file that has a bin (bin |
What's the position of the first variant in that file? |
58720256 |
OK, I'm rejigging a bunch of things to avoid using the record counts from the indexes. They're really not reliable, and not present in a lot of cases. |
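To illustrate the general idea (not the actual change in bio2zarr, which is not shown in this thread): instead of trusting per-chunk record counts stored in an index, count the records by scanning positions and assigning each one to a partition. The function name and the half-open `(start, end)` partition layout below are assumptions made for the sketch.

```python
from bisect import bisect_right


def count_by_scanning(positions, partitions):
    """Count variants per partition by looking at the positions themselves.

    positions: iterable of variant positions.
    partitions: sorted, non-overlapping half-open (start, end) intervals
        (an assumed layout for this sketch).
    """
    starts = [start for start, _ in partitions]
    counts = [0] * len(partitions)
    for pos in positions:
        # Last partition whose start is <= pos.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < partitions[i][1]:
            counts[i] += 1
    return counts
```

Comparing these counts against whatever the index reports shows which partitions the index got wrong, and zeros in the result flag empty partitions.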
Hopefully closed in #164. I'm going to push out a release in a few minutes so we can test @benjeffery |