Releases: NVIDIA/TensorRT

TensorRT OSS v8.4.3

19 Aug 22:51

TensorRT OSS release corresponding to TensorRT 8.4.3.1 release.

Key Updates:

  • Python packages for Python 3.10.
  • Bug fix for potential overlaps in H2D and inference execution in trtexec.

22.08

17 Aug 00:14

Commit used by the 22.08 TensorRT NGC container.

Changelog

Updated TensorRT version to 8.4.2 - see the TensorRT 8.4.2 release notes for more information

Changed

  • Updated default protobuf version to 3.20.x
  • Updated ONNX-TensorRT submodule version to 22.08 tag
  • Updated sampleIOFormats and sampleAlgorithmSelector to use ONNX models over Caffe

Fixes

  • Fixed missing serialization member in CustomClipPlugin plugin
  • Fixed various Python import issues

Added

  • Added new DeBERTa demo
  • Added version 2 of disentangledAttentionPlugin to support DeBERTa v2

Removed

  • None

22.07

22 Jul 02:46

Commit used by the 22.07 TensorRT NGC container.

Changelog

Added

  • polygraphy-trtexec-plugin tool for Polygraphy
  • Multi-profile support for demoBERT
  • KV cache support for HF BART demo

Changed

  • Updated ONNX-GS to v0.3.20

Removed

  • None

TensorRT OSS v8.4.1 GA

14 Jun 21:25

TensorRT OSS release corresponding to TensorRT 8.4.1.5 GA release.

Key Features and Updates:

  • Samples enhancements

  • EfficientDet sample

    • Added support for EfficientDet Lite and AdvProp models.
    • Added dynamic batch support.
    • Added mixed precision engine builder.
  • HuggingFace transformer demo

    • Added BART model.
    • Performance speedup of GPT-2 greedy search using GPU implementation.
    • Fixed GPT-2 ONNX export failure caused by the 2 GB file size limit.
    • Extended Megatron LayerNorm plugins to support larger hidden sizes.
    • Added performance benchmarking mode.
    • Enabled TF32 format by default.
  • demoBERT enhancements

    • Added --duration flag to the performance benchmarking script.
    • Fixed import of nvinfer_plugins library in demoBERT on Windows.
  • Torch-QAT toolkit

    • quant_bert.py module removed. It is now upstreamed to HuggingFace QDQBERT.
    • Use axis0 as default for deconv.
    • #1939 - Fixed path in classification_flow example.
  • Plugin enhancements

  • Build containers

    • Updated default CUDA version to 11.6.2.
    • CentOS Linux 8 has reached End-of-Life on Dec 31, 2021. The corresponding container has been removed from TensorRT-OSS.
    • Install devtoolset-8 for updated g++ versions in CentOS7 container.
  • Tooling enhancements

  • trtexec enhancements

    • Added --layerPrecisions and --layerOutputTypes flags for specifying layer-wise precision and output type constraints.
    • Added --memPoolSize flag to specify the size of the workspace as well as the DLA memory pools via a unified interface. Correspondingly, the --workspace flag has been deprecated.
    • The "End-To-End Host Latency" metric has been removed. Use the "Host Latency" metric instead. For more information, refer to the Benchmarking Network section in the TensorRT Developer Guide.
    • Uses enqueueV2() instead of enqueue() when the engine has explicit batch dimensions.
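
As a rough illustration of the new trtexec flags listed above, the following Python sketch only composes a command line; it does not run trtexec. The helper name build_trtexec_args, the model path model.onnx, and the layer name softmax_1 are hypothetical. The pool:sizeInMiB and layerName:precision spec formats, and the pairing of --layerPrecisions with --precisionConstraints, are assumptions based on trtexec's help output and should be checked against the installed version. A real build would likely also need the matching precision enabled (for example --fp16).

```python
from typing import Dict, List, Optional


def build_trtexec_args(
    onnx_path: str,
    mem_pools_mib: Dict[str, int],
    layer_precisions: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Compose a trtexec argument list (hypothetical helper; nothing is executed)."""
    args = ["trtexec", f"--onnx={onnx_path}"]

    # --memPoolSize replaces the deprecated --workspace flag.
    # Assumed spec format: comma-separated pool:sizeInMiB pairs.
    pools = ",".join(f"{name}:{size}" for name, size in mem_pools_mib.items())
    args.append(f"--memPoolSize={pools}")

    if layer_precisions:
        # Assumption: per-layer precisions only take effect when precision
        # constraints are set to "obey" or "prefer".
        args.append("--precisionConstraints=obey")
        # Assumed spec format: comma-separated layerName:precision pairs,
        # where "*" sets the default for all unspecified layers.
        spec = ",".join(f"{layer}:{prec}" for layer, prec in layer_precisions.items())
        args.append(f"--layerPrecisions={spec}")

    return args


# Hypothetical model path and layer name, for illustration only.
args = build_trtexec_args(
    "model.onnx",
    {"workspace": 1024},
    {"*": "fp16", "softmax_1": "fp32"},
)
print(" ".join(args))
```

Keeping the arguments as a list makes it straightforward to pass them to subprocess.run later without shell quoting issues, which matters here because the "*" in the layer spec would otherwise be subject to shell globbing.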

22.06

09 Jun 02:54

Commit used by the 22.06 TensorRT NGC container.

Changelog

Added

  • None

Changed

  • Disentangled attention (DMHA) plugin refactored
  • ONNX parser updated to 8.2GA

Removed

  • None

22.05

13 May 21:52

Commit used by the 22.05 TensorRT NGC container.

Changelog

Added

  • Disentangled attention plugin for DeBERTa
  • DMHA (multiscaleDeformableAttnPlugin) plugin for DDETR
  • Performance benchmarking mode to HuggingFace demo

Changed

  • Updated base TensorRT version to 8.2.5.1
  • Updated onnx-graphsurgeon to v0.3.19 (see its CHANGELOG)
  • fp16 support for pillarScatterPlugin
  • #1939 - Fixed path in quantization classification_flow
  • Fixed GPT-2 ONNX export failure caused by the 2 GB file size limit
  • Use axis0 as default for deconv in pytorch-quantization toolkit
  • Updated onnx export script for CoordConvAC sample
  • Install devtoolset-8 for updated g++ version in CentOS7 container

Removed

  • Usage of deprecated TensorRT APIs in samples removed
  • quant_bert.py module removed from pytorch-quantization

22.04

14 Apr 01:19

Commit used by the 22.04 TensorRT NGC container.

Changelog

Added

  • TensorRT Engine Explorer v0.1.0 README
  • Detectron 2 Mask R-CNN R50-FPN python sample
  • Model export script for sampleOnnxMnistCoordConvAC

Changed

  • Updated base TensorRT version to 8.2.4.2
  • Updated copyright headers with SPDX identifiers
  • Updated onnx-graphsurgeon to v0.3.17 (see its CHANGELOG)
  • PyramidROIAlign plugin refactor and bug fixes
  • Fixed MultilevelCropAndResize crashes on Windows
  • #1583 - sublicense ieee/half.h under Apache2
  • Updated demo/BERT performance tables for rel-8.2
  • #1774 - Fixed Python hang on IndexError when TF is imported after TensorRT
  • Various bugfixes in demos - BERT, Tacotron2 and HuggingFace GPT/T5 notebooks
  • Cleaned up sample READMEs

Removed

  • sampleNMT removed from samples

22.03

24 Mar 05:20

Commit used by the 22.03 TensorRT NGC container.

Changelog

Added

  • EfficientDet sample enhancements
    • Added support for EfficientDet Lite and AdvProp models.
    • Added dynamic batch support.
    • Added mixed precision engine builder.

Changed

  • Better decoupling of HuggingFace demo tests

22.02

04 Feb 18:40

Commit used by the 22.02 TensorRT NGC container.

Changelog

Added

Changed

  • Extended Megatron LayerNorm plugins to support larger hidden sizes
  • Refactored EfficientNMS plugin for TFTRT and added implicit batch mode support
  • Updated base TensorRT version to 8.2.3.0
  • GPT-2 greedy search speedup - now runs on GPU
  • Updates to TensorRT developer tools
  • Updated ONNX parser to v8.2.3.0
  • Minor updates and bugfixes
    • Samples: TFOD, GPT-2, demo/BERT
    • Plugins: proposalPlugin, geluPlugin, bertQKVToContextPlugin, batchedNMS

Removed

  • Unused source file(s) in demo/BERT

22.01

24 Jan 23:49

Commit used by the 22.01 TensorRT NGC container.