This would mainly be used when writing a large fastq file to a data store, like S3, while still wanting to seek out specific lines from that fastq file. There would be two modifications: standardization of size,
- (2 byte) uint16: length of read ID
- (var byte) read ID (a UUID can be used directly, or a hash of the identifier can be used). Often 16 bytes for a UUID
- (8 byte) uint64: start position
- (4 byte) uint32: length
30 bytes per record for a typical run (2 + 16 + 8 + 4, assuming a 16-byte UUID as the read ID). If a PromethION flow cell returns 10,000,000 reads, the index file will be 300,000,000 bytes, or approximately 286 MiB.
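To make the layout above concrete, here is a minimal sketch of packing and unpacking one variable-length record in Python. The byte order is not stated anywhere in this thread, so little-endian is an assumption, and the function names `pack_record` and `unpack_record` are hypothetical.

```python
import struct

# Assumption: little-endian fields with no padding; the proposal does not specify this.
ID_LEN = struct.Struct("<H")   # uint16: length of read ID
TAIL = struct.Struct("<QI")    # uint64 start position, uint32 length


def pack_record(read_id: bytes, start: int, length: int) -> bytes:
    """Serialize one index record: id length, id bytes, start, length."""
    return ID_LEN.pack(len(read_id)) + read_id + TAIL.pack(start, length)


def unpack_record(buf: bytes, offset: int = 0):
    """Parse one record at `offset`; return (read_id, start, length, next_offset)."""
    (id_len,) = ID_LEN.unpack_from(buf, offset)
    offset += ID_LEN.size
    read_id = bytes(buf[offset:offset + id_len])
    offset += id_len
    start, length = TAIL.unpack_from(buf, offset)
    return read_id, start, length, offset + TAIL.size
```

With a 16-byte UUID as the read ID, each record comes to 2 + 16 + 8 + 4 = 30 bytes, which matches the estimate above. Because the ID length varies, records have to be walked in order using the returned `next_offset`; you cannot jump straight to the n-th record.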
Hmm, I think static allocation of bytes might be interesting here.
- (16 byte) read ID (UUIDs can be used directly or a hash of the identifier can be used)
- (8 byte) uint64: start position
- (4 byte) uint32: length
This would allow you to statically allocate the whole index in memory. With every record fixed at 28 bytes (16 + 8 + 4), you can derive the exact number of reads from the byte length of the file, and you can statically allocate a whole bunch of things.
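A rough sketch of what the fixed-size variant buys you, in Python: the record count falls out of the file size, and the i-th record sits at a known offset. Little-endian byte order and the names `record_count` and `read_record` are assumptions made for illustration, not part of the proposal.

```python
import struct

# Assumption: little-endian, no padding. 16-byte read ID, uint64 start, uint32 length.
RECORD = struct.Struct("<16sQI")  # 28 bytes per record


def record_count(index_size_bytes: int) -> int:
    """Derive the number of reads from the byte length of the index file."""
    count, remainder = divmod(index_size_bytes, RECORD.size)
    if remainder:
        raise ValueError("index size is not a multiple of the record size")
    return count


def read_record(index_file, i: int):
    """Seek directly to record i and return (read_id, start, length)."""
    index_file.seek(i * RECORD.size)
    data = index_file.read(RECORD.size)
    if len(data) != RECORD.size:
        raise IndexError(f"record {i} is out of range")
    return RECORD.unpack(data)
```

`index_file` can be any seekable binary file object. For an index stored in an object store, the same `i * RECORD.size` arithmetic should translate to a byte-range request for just the 28 bytes of interest.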
I want a binary fastqindex similar to https://hasindu2008.github.io/slow5specs/slow5-v1.0.0.pdf