Triton Inference Server

Aug 30, 2023
How to Build a Distributed Inference Cache with NVIDIA Triton and Redis
Caching is as fundamental to computing as arrays, symbols, or strings. Various layers of caching throughout the stack hold instructions from memory while...
13 MIN READ

Aug 15, 2023
Customizing AI Models: Train Character Detection and Recognition Models with NVIDIA TAO
Optical Character Detection (OCD) and Optical Character Recognition (OCR) are computer vision techniques used to extract text from images. Use cases vary across...
14 MIN READ

Aug 15, 2023
Customizing AI Models: Deploy a Character Detection and Recognition Model with NVIDIA Triton
NVIDIA Triton Inference Server streamlines and standardizes AI inference by enabling teams to deploy, run, and scale trained ML or DL models from any framework...
4 MIN READ

Jun 28, 2023
How to Deploy an AI Model in Python with PyTriton
AI models are everywhere, in the form of chatbots, classification and summarization tools, image models for segmentation and detection, recommendation models,...
6 MIN READ

May 05, 2023
Why Automatic Augmentation Matters
Deep learning models require hundreds of gigabytes of data to generalize well on unseen samples. Data augmentation helps by increasing the variability of...
13 MIN READ

May 04, 2023
Increasing Throughput and Reducing Costs for AI-Based Computer Vision with CV-CUDA
Real-time cloud-scale applications that involve AI-based computer vision are growing rapidly. The use cases include image understanding, content creation,...
11 MIN READ

Apr 25, 2023
Increasing Inference Acceleration of KoGPT with NVIDIA FasterTransformer
Transformers are one of the most influential AI model architectures today and are shaping the direction of future AI R&D. First invented as a tool for...
6 MIN READ

Mar 29, 2023
Bootstrapping Object Detection Model Training with 3D Synthetic Data
Training AI models requires mountains of data. Acquiring large sets of training data can be difficult, time-consuming, and expensive. Also, the data collected...
12 MIN READ

Mar 23, 2023
Power Your AI Inference with New NVIDIA Triton and NVIDIA TensorRT Features
NVIDIA AI inference software consists of NVIDIA Triton Inference Server, open-source inference serving software, and NVIDIA TensorRT, an SDK for...
5 MIN READ

Mar 22, 2023
SDKs Accelerating Industry 5.0, Data Pipelines, Computational Science, and More Featured at NVIDIA GTC 2023
At NVIDIA GTC 2023, NVIDIA unveiled notable updates to its suite of NVIDIA AI software for developers to accelerate computing. The updates reduce costs in...
10 MIN READ

Mar 13, 2023
Serving ML Model Pipelines on NVIDIA Triton Inference Server with Ensemble Models
In many production-level machine learning (ML) applications, inference is not limited to running a forward pass on a single ML model. Instead, a pipeline of ML...
19 MIN READ

Feb 08, 2023
Speech AI Spotlight: How Pendulum Nabs Harmful Narratives Online
Over 55% of the global population uses social media, easily sharing online content with just one click. While connecting with others and consuming entertaining...
7 MIN READ

Jan 12, 2023
Autoscaling NVIDIA Riva Deployment with Kubernetes for Speech AI in Production
Speech AI applications, from call centers to virtual assistants, rely heavily on automatic speech recognition (ASR) and text-to-speech (TTS). ASR can process...
13 MIN READ

Dec 19, 2022
Deploying Diverse AI Model Categories from Public Model Zoo Using NVIDIA Triton Inference Server
Many implementations of state-of-the-art (SOTA) models and modeling solutions are now available for frameworks such as TensorFlow, ONNX,...
12 MIN READ

Dec 08, 2022
Introducing NVIDIA Riva: A GPU-Accelerated SDK for Developing Speech AI Applications
This post was updated in March 2023. Speech AI is used in a variety of applications, including contact...
8 MIN READ

Nov 30, 2022
Designing an Optimal AI Inference Pipeline for Autonomous Driving
Self-driving cars must be able to detect objects quickly and accurately to ensure the safety of their drivers and other drivers on the road. Due to this need...
8 MIN READ