
Overview

bgzf-randreader is a BGZF reader that supports random access by offsets into the uncompressed data.

Suppose we have a BGZF file compressed from some text:

$ echo 'The quick brown fox jumps over the lazy dog' | bgzip > test.gz

We can randomly access any part of it via RandomAccessBgzFile without decompressing the whole file:

RandomAccessBgzFile file = new RandomAccessBgzFile(new File("test.gz"));
try {
    byte[] b = new byte[5];
    file.seek(4);
    file.read(b);
    System.out.println(new String(b));  // outputs: quick
} finally {
    file.close();   // always close the file to prevent a memory leak
}

Maven dependencies

To use bgzf-randreader in a Maven-based project, add the following dependency:

<dependency>
    <groupId>com.vivimice</groupId>
    <artifactId>bgzf-randreader</artifactId>
    <version>1.1.1</version>
</dependency>

About BGZF

BGZF is a gzip-compatible compression format. It is a block compression format implemented on top of the standard gzip file format.
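To make the block layout concrete, here is a minimal sketch that is not part of bgzf-randreader. It assumes the layout described in the SAM specification linked at the end of this README: each block is a gzip member whose extra field carries a "BC" subfield holding BSIZE, the total block size minus one. The class and method names (BgzfBlockSketch, writeBlock, blockSize, blockOffsets) are hypothetical, and the end-of-file marker block that bgzip appends is left out for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;
import java.util.zip.Deflater;

public class BgzfBlockSketch {

    /** Builds one BGZF-style block: a gzip member with a "BC" extra subfield. */
    static byte[] writeBlock(byte[] data) {
        // Raw DEFLATE (nowrap = true); the gzip header and trailer are written by hand.
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            compressed.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        byte[] cdata = compressed.toByteArray();

        CRC32 crc = new CRC32();
        crc.update(data);

        int total = 18 + cdata.length + 8; // 18-byte header + payload + 8-byte trailer
        if (total > 65536) {
            throw new IllegalArgumentException("block too large for a 16-bit BSIZE");
        }
        ByteBuffer out = ByteBuffer.allocate(total).order(ByteOrder.LITTLE_ENDIAN);
        out.put((byte) 31).put((byte) 139).put((byte) 8).put((byte) 4); // ID1, ID2, CM, FLG = FEXTRA
        out.putInt(0);                        // MTIME
        out.put((byte) 0).put((byte) 255);    // XFL, OS (unknown)
        out.putShort((short) 6);              // XLEN: one 6-byte subfield
        out.put((byte) 'B').put((byte) 'C');  // SI1, SI2
        out.putShort((short) 2);              // SLEN
        out.putShort((short) (total - 1));    // BSIZE = total block size - 1
        out.put(cdata);
        out.putInt((int) crc.getValue());     // CRC32 of the uncompressed data
        out.putInt(data.length);              // ISIZE: uncompressed length
        return out.array();
    }

    /** Reads the total size of the block starting at the given offset. */
    static int blockSize(byte[] file, int offset) {
        ByteBuffer in = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        if ((in.get(offset) & 0xff) != 31 || (in.get(offset + 1) & 0xff) != 139
                || (in.get(offset + 3) & 4) == 0) {
            throw new IllegalArgumentException("not a gzip member with an extra field");
        }
        int xlen = in.getShort(offset + 10) & 0xffff;
        int pos = offset + 12;
        int end = pos + xlen;
        while (pos + 4 <= end) { // walk the extra subfields looking for "BC"
            int slen = in.getShort(pos + 2) & 0xffff;
            if (in.get(pos) == 'B' && in.get(pos + 1) == 'C' && slen == 2) {
                return (in.getShort(pos + 4) & 0xffff) + 1;
            }
            pos += 4 + slen;
        }
        throw new IllegalArgumentException("no BC subfield: not a BGZF block");
    }

    /** Lists the compressed offset of every block without inflating any data. */
    static List<Integer> blockOffsets(byte[] file) {
        List<Integer> offsets = new ArrayList<>();
        for (int pos = 0; pos < file.length; pos += blockSize(file, pos)) {
            offsets.add(pos);
        }
        return offsets;
    }

    public static void main(String[] args) {
        byte[] first = writeBlock("The quick brown fox ".getBytes(StandardCharsets.UTF_8));
        byte[] second = writeBlock("jumps over the lazy dog\n".getBytes(StandardCharsets.UTF_8));
        byte[] file = new byte[first.length + second.length];
        System.arraycopy(first, 0, file, 0, first.length);
        System.arraycopy(second, 0, file, first.length, second.length);
        System.out.println("block offsets: " + blockOffsets(file));
    }
}
```

Because every block records its own compressed size, a reader can hop from block to block and inflate only the block that contains the requested position. How bgzf-randreader does this internally is not shown here.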

A BGZF file can be generated from an existing gzip file or from any uncompressed data with the bgzip utility. Any gzip-compatible utility (such as gunzip, zcat, zgrep, or GZIPInputStream) can decompress a BGZF file.
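As a rough illustration of this compatibility, the sketch below reads a stream of concatenated gzip members sequentially with the JDK's GZIPInputStream. The input is built in memory from two plain gzip members to mimic the multi-block layout; it is not real bgzip output, and the class and method names (GzipCompatSketch, gzipMember, gunzipAll) are hypothetical. To try it on an actual file, a FileInputStream over test.gz could be passed to gunzipAll instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipCompatSketch {

    /** Compresses the text into one complete gzip member. */
    static byte[] gzipMember(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    /** Decompresses every gzip member in the stream, front to back. */
    static String gunzipAll(InputStream compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(compressed)) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Two members back to back, standing in for two BGZF blocks.
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        file.write(gzipMember("The quick brown fox "));
        file.write(gzipMember("jumps over the lazy dog"));
        // prints: The quick brown fox jumps over the lazy dog
        System.out.println(gunzipAll(new ByteArrayInputStream(file.toByteArray())));
    }
}
```

Sequential reading like this always starts from the beginning of the file; seeking to an arbitrary uncompressed offset is what RandomAccessBgzFile adds.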

On Debian/Ubuntu, the bgzip utility is included in the tabix package.

More about BGZF: http://samtools.github.io/hts-specs/SAMv1.pdf