Jules Belveze

Resources

  • DeeBERT: Teaching BERT When to Stop Thinking

    Why does BERT need twelve layers to classify “I love this movie” as positive?

  • Early Exiting: The Under-Hyped Compression Method

    Why are we burning GPU hours to answer “2 + 2 = 4”?

  • Instruction Fine-Tuning Fundamentals

    Instruction Fine-Tuning (IFT) is the secret sauce that transforms generic language models into obedient AI assistants.

  • Scaling Machine Learning Experiments With neptune.ai and Kubernetes

    Link post: managing and scaling experiment tracking with Neptune and Kubernetes.

  • Case Study: MLOps for NLP-powered Media Intelligence using Metaflow

    Link post: case study on building NLP media intelligence with Metaflow.

  • Scaling-up PyTorch inference: Serving billions of daily NLP inferences with ONNX Runtime

    Link post: engineering write-up on large-scale PyTorch inference with ONNX Runtime.

  • Atlastic Reputation AI: Four Years of Advancing and Applying a SOTA NLP Classifier

    Link post: paper on advancing and applying a SOTA NLP classifier.
