bertsqueeze#

Project#

Bert-squeeze is a repository that provides code to reduce the size of Transformer-based models or to decrease their latency at inference time.

It gathers a non-exhaustive list of techniques, such as distillation, pruning, quantization, and early exiting. The repo is built using PyTorch Lightning and Transformers.
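
To give a feel for one of the listed techniques, here is a minimal sketch of a soft-target distillation loss written with the Python standard library only. It is not bert-squeeze's implementation or API: the function names `softmax` and `distillation_loss`, the temperature value, and the example logits are all assumptions made for illustration. The idea is that a small student model is trained to match the temperature-softened output distribution of a larger teacher.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (illustrative helper)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper for illustration; not part of bert-squeeze.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Made-up logits for a 3-class problem.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(close_student, teacher))
print(distillation_loss(far_student, teacher))
```

A student whose logits are close to the teacher's gets a smaller loss than one that disagrees with it. In practice this term is usually combined with the ordinary cross-entropy loss on the true labels.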

Content#

Getting started

Bertsqueeze

API References

API
- Assistants
- Models
- Distillation
- Data
- Inference
- Utilities

GitHub Repository

Indices and tables#