Two-pass non-Dask VCF conversion #1185
I've just added a basic PLINK conversion approach, which converts the HAPNEST chr21 in about 20 minutes (6 workers, 8 encode threads per worker, and a maximum of about 40 GB of RAM per worker). It's chugging through chr2 in what looks like linearly scaling time, so something on the order of an hour. Watching with Linux perf, the vast majority of the time is spent on Blosc encoding and compressing the chunks. In contrast, using the existing [...] I'll update when it finishes to give the overall timing.
Update - it failed after about an hour with a bunch of completely cryptic messages.
Fix missing string bug
Closing as development has moved to https://github.com/jeromekelleher/bio2zarr
Very much WIP - not ready for review!