- Encoding/decoding is slow in first iteration
- What is a restart interval
- Decoding is too slow
- Encoding different color spaces than full-range YCbCr BT.601
- Optimizing encoding/decoding performance
- Decoding (foreign) JPEG fails
- Encoding/decoding alpha channel
- What are memory requirements for encoding/decoding
Correct. This is because of the initialization of GPUJPEG internal structures, CUDA buffers and the GPU execution pipeline, as well as kernel compilation for the actual device capability. The last point can be eliminated by generating code for the particular device during compilation:
```
cmake -DCMAKE_CUDA_ARCHITECTURES=native -DCMAKE_BUILD_TYPE=Release ...
```

(`all-major` or `all` will also work, but the compilation will take longer)
Ideal use case for GPUJPEG is to run for many images (ideally equal-sized).
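To amortize the one-time cost, the typical pattern is to create the encoder once and reuse it for the whole stream of images. A minimal sketch follows (function and member names per the GPUJPEG public API; exact signatures may differ slightly between versions, and error handling is mostly omitted):

```c
#include <libgpujpeg/gpujpeg_encoder.h>

int encode_stream(uint8_t **frames, size_t frame_count, int width, int height)
{
    struct gpujpeg_parameters param;
    gpujpeg_set_default_parameters(&param);

    struct gpujpeg_image_parameters param_image;
    gpujpeg_image_set_default_parameters(&param_image);
    param_image.width = width;
    param_image.height = height;

    // pay the initialization cost once
    struct gpujpeg_encoder *encoder = gpujpeg_encoder_create(0);
    if (encoder == NULL) {
        return -1;
    }

    for (size_t i = 0; i < frame_count; ++i) {
        struct gpujpeg_encoder_input input;
        gpujpeg_encoder_input_set_image(&input, frames[i]);
        uint8_t *jpeg = NULL;
        size_t jpeg_size = 0;
        // subsequent calls reuse the already-initialized pipeline
        if (gpujpeg_encoder_encode(encoder, &param, &param_image,
                                   &input, &jpeg, &jpeg_size) != 0) {
            gpujpeg_encoder_destroy(encoder);
            return -1;
        }
        /* ... write jpeg/jpeg_size somewhere ... */
    }
    gpujpeg_encoder_destroy(encoder);
    return 0;
}
```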
A restart interval (and the related option in the console application) is a way to increase parallelism to allow efficient Huffman encoding and decoding on the GPU. It is given by the number of MCUs (minimum coded units, approximately the same as macroblocks) that can be encoded or decoded independently.
For the encoder, the restart interval is given as the member variable `restart_interval` of `struct gpujpeg_parameters`. Higher values result in smaller JPEG images but slower encoding. Good values are between 8 (the default set by `gpujpeg_set_default_parameters()`) and 16. Disabling restart intervals (setting the value to 0) causes the Huffman encoding/decoding to be done on the CPU (while the rest is still performed by the GPU). On larger images, the restart interval can be a bit larger because there are more MCUs. `gpujpegtool` provides the `-r` option (if not set, an eligible runtime-determined value is used).
For the decoder, the value cannot be changed because it is an attribute of the encoded JPEG.
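In code, the encoder-side setting amounts to one assignment on the parameters (a sketch, assuming an encoder set up as usual):

```c
struct gpujpeg_parameters param;
gpujpeg_set_default_parameters(&param); // sets restart_interval = 8
param.restart_interval = 16;            // larger interval, e.g. for big images
// param.restart_interval = 0;          // or disable => Huffman coding on CPU
```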
It can be an effect of not using restart intervals in the JPEG (e.g. by using an encoder other than GPUJPEG). You can check the number of segments with the following command:
```
gpujpeg -I image.jpg
[...]
Segment count: 129600 (DRI = 12)
```
Values on the order of hundreds or thousands mean that the number of segments should not be a problem.
You can also benchmark and find the potential bottleneck by running:
```
gpujpeg -v -d image.jpg image.pnm
[...]
-Stream Reader: 543.33 ms
-Copy To Device: 1.26 ms
-Huffman Decoder: 1.27 ms
-DCT & Quantization: 4.27 ms
-Postprocessing: 2.89 ms
-Copy From Device: 8.43 ms
Decode Image GPU: 56.64 ms (only in-GPU processing)
Decode Image Bare: 600.00 ms (without copy to/from GPU memory)
Decode Image: 609.70 ms
Save Image: 139.37 ms
```
which shows the duration of the individual decoding steps (use `-n <iter>` to see the duration of more iterations with the same image).
For compatibility reasons, GPUJPEG produces full-range YCbCr BT.601 with a JFIF header; color space conversions are performed by the encoder if needed. There is, however, a possibility to encode other color spaces as well, like RGB or limited-range YCbCr BT.709. Since JFIF supports only BT.601 YCbCr or grayscale, the SPIFF (for BT.709) or Adobe (RGB) format is used in this case. Especially the first one is not widely used, so it may introduce some compatibility problems when not decoded by the GPUJPEG decoder.
Usage in code is simple, just set `gpujpeg_parameters::internal_color_space` to the required JPEG internal representation.
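For example, to keep RGB inside the JPEG instead of converting it to BT.601 YCbCr (a sketch; `GPUJPEG_RGB` is GPUJPEG's color-space enumeration value, and the image-parameter setup is the usual one):

```c
struct gpujpeg_parameters param;
gpujpeg_set_default_parameters(&param);
param.internal_color_space = GPUJPEG_RGB; // store RGB directly (Adobe header)

struct gpujpeg_image_parameters param_image;
gpujpeg_image_set_default_parameters(&param_image);
param_image.color_space = GPUJPEG_RGB;    // input is RGB too => no conversion
```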
This can also be done with the command-line application (it just preserves the input color space; it cannot be changed by the app now). The relevant option is `-N` (native):
```
gpujpeg -s 1920x1080 -N -e image.rgb image.jpg
```
Note: SPIFF is not a widely adopted JPEG file format, so it is highly probable that decoders other than GPUJPEG won't support the picture and will ignore the color-space information.
To optimize encoding/decoding performance, the following features can be tweaked (in order of importance):
- restart intervals turned on and set to a reasonable value (see autotuning in main.c)
- enabling segment info (needs to be set on the encoder, speeds up the decoder)
- avoiding color space conversions: setting `gpujpeg_parameters::internal_color_space` equal to `gpujpeg_image_parameters::color_space`
- reducing quality: the lower the quality, the less work for the entropy encoder and decoder
It is also very helpful to use an image that is already in GPU memory, if possible, to avoid costly memory transfers.
It is also advisable to look at the individual performance counters to spot bottlenecks (parameter `-v` for the command-line application; see the relevant code in main.c for how to use this in custom code).
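Put together, the tweaks above might look like this on the encoder side (a sketch; member names as in `struct gpujpeg_parameters`, check the headers of your version):

```c
struct gpujpeg_parameters param;
gpujpeg_set_default_parameters(&param);
param.restart_interval = 16;              // restart intervals on, reasonable value
param.segment_info = 1;                   // embed segment info to speed up decoding
param.internal_color_space = GPUJPEG_RGB; // match the input color space, skip conversion
param.quality = 75;                       // lower quality => less entropy-coding work
```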
GPUJPEG is always capable of decoding JPEGs encoded by itself. As the standard allows many options, including progressive encoding, arithmetic coding etc., not all of them are supported. Basically, baseline DCT-based Huffman-encoded JPEGs are supported. A few features of the extended process are supported as well (4 Huffman tables). If the decoder is incapable of decoding a JPEG matching the above, you are encouraged to file a bug report.
Encoding is currently supported only for a single packed pixel format, 444-u8-p012a (`GPUJPEG_444_U8_P012A`). Let us know if you'd need some other, like a planar format.
To use it with the command-line application, you'd need to use the option `-a` for encoding and use a pixel format that supports 4 channels, e.g. RGBA. Some examples:
```
gpujpeg -a -e -s 1920x1080 input.rgba output.jpg
gpujpeg -a -e input.pam output.jpg
gpujpeg -a -e input.yuv -f 444-u8-p012a output.jpg # YUVA
```
For decoding, you'd need to have a 4-channel JPEG; no special tweaking is needed, just use a proper output pixel format, e.g.:
```
gpujpeg -d input.jpg output.rgba
gpujpeg -d input.jpg output.pam
```
Note: GPUJPEG produces SPIFF headers for the generated JPEG files; only a few decoders will recognize that, but some do recognize the component IDs ('R', 'G', 'B', 'A' and 0x01, 0x02, 0x03, 0x04).
Encoding alpha is quite simple, as indicated above: just set the pixel format `GPUJPEG_444_U8_P0123` as `gpujpeg_image_parameters::pixel_format` and set the subsampling to 4:4:4:4:

```
gpujpeg_parameters_chroma_subsampling(param, GPUJPEG_SUBSAMPLING_4444);
```
For decoding, select as the output pixel format either `GPUJPEG_444_U8_P0123` or `GPUJPEG_PIXFMT_AUTODETECT` (RGB will be used if set to `GPUJPEG_PIXFMT_NONE`).
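On the decoder side this can be requested e.g. via `gpujpeg_decoder_set_output_format()` (a sketch; the enumeration values are the ones mentioned above, and error handling is omitted):

```c
struct gpujpeg_decoder *decoder = gpujpeg_decoder_create(0);
// request 4-channel output including the alpha channel
gpujpeg_decoder_set_output_format(decoder, GPUJPEG_RGB, GPUJPEG_444_U8_P0123);
```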
Currently you can count on about 20 bytes of GPU memory for every pixel and component, for both encoding and decoding; e.g. for a 33 Mpix 4:4:4 frame it is 7680×4320×3×20 bytes ≈ 1898 MiB. If the JPEG is 4:2:0 subsampled, the memory requirements would be halved.
You can check the amount of required GPU memory by running (adjust the image format and parameters according to your needs):
```
$ gpujpegtool -v -e 1920x1080.tst /dev/null # or output file "nul" in MSW
$ gpujpegtool -v -S -N -ai -r 16 -e 1920x1080.p_4444-u8-p0123.tst /dev/null
...
Total GPU Memory Size: 102.6 MiB
```
The memory requirements may be excessive when dealing with really huge images; let us know if this is a problem for you.