Object Detection at 2530 FPS with TensorRT and 8-Bit Quantization
December 31, 2020
This article is a deep dive into the techniques needed to reach an SSD300 object detection throughput of 2530 FPS. We will rewrite PyTorch model code, perform ONNX graph surgery, optimize a TensorRT plugin, and finally quantize the model to an 8-bit representation. We will also examine how far the quantized model's accuracy diverges from that of the full-precision model.
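To make the idea of an 8-bit representation concrete before going further, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is an illustration only, not TensorRT's calibration procedure: the helper names `quantize_int8` and `dequantize_int8` and the sample `weights` are hypothetical, and the scale is simply derived from the largest absolute value in the list.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to signed 8-bit codes."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 127; use 1.0 for an all-zero input
    # so that the division below is always defined.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical sample values standing in for a slice of model weights.
weights = [0.02, -0.75, 0.31, 1.27, -1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Round-trip error: the per-value divergence from full precision.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The round-trip error of any single value is bounded by half the scale, which is why the choice of range matters: a single large outlier stretches the scale and costs precision for every other value in the tensor.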