December 31, 2020
This article is a deep dive into the techniques needed to get SSD300 object detection throughput to 2530 FPS. We will rewrite Pytorch model code, perform ONNX graph surgery, optimize a TensorRT plugin and finally we’ll quantize the model to an 8-bit representation. We will also examine divergence from the accuracy of the full-precision model.
October 29, 2020
TorchScript is one of the most important parts of the Pytorch ecosystem, allowing portable, efficient and nearly seamless deployment. With just a few lines of torch.jit
code and some simple model changes you can export an asset that runs anywhere libtorch
does. It’s an important toolset to master if you want to run your models outside the lab at high efficiency. This article is a collection of topics going beyond the basics of your first export.
October 17, 2020
In this article we take performance of the SSD300 model even further, leaving Python behind and moving towards true production deployment technologies: TorchScript, TensorRT and DeepStream. We also identify and understand several limitations in Nvidia’s DeepStream framework, and then remove them by modifying how the nvinfer
element works.