Scaling-up PyTorch inference: Serving billions of daily NLP inferences with ONNX Runtime (Microsoft Open Source Blog)