TensorRT: What’s New

TensorRT 8.2 includes new optimizations that make it possible to run billion-parameter language models in real time.

TensorRT is also integrated with PyTorch and TensorFlow.

Highlights:

  • TensorRT 8.2 - Optimizations for T5 and GPT-2 deliver real-time translation and summarization, with up to 21x faster performance than CPUs
  • TensorRT 8.2 - A simple Python API for developers using Windows
  • Torch-TensorRT - The PyTorch integration delivers up to 6x faster performance than in-framework inference on GPUs with one line of code
  • TensorFlow-TensorRT - The TensorFlow integration delivers up to 6x faster performance than in-framework inference on GPUs with one line of code

Torch-TensorRT is available today in the PyTorch container on the NGC catalog.

TensorFlow-TensorRT is available today in the TensorFlow container on the NGC catalog.

TensorRT 8.2 is freely available today to members of the NVIDIA Developer Program.



You can find additional resources on the NVIDIA Developer Blog, or connect with other TensorRT developers on the NVIDIA Developer Forum.




Introductory Resources



Introductory Blog

Learn how to apply TensorRT optimizations and deploy a PyTorch model to GPUs.

Read Blog

Introductory Webinar

Watch to learn more about TensorRT 8.2 features and the tools that simplify the inference workflow.

Watch Webinar

Developer Guide

See how to get started with TensorRT in this step-by-step developer guide and API reference.

Read Guide




Additional TensorRT Resources

Framework Integrations


Conversational AI



Image and Video



Recommendation Systems



Ethical AI

NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.