[ MANIFEST ]

// 11 entries · spanning 2023 / 2026

  1. 2026 1 entry
  2. [NEW] Anatomy of verl, the RL post-training framework I lived in

  3. 2025 3 entries
  4. Install flash-attn without crying while using uv

  5. A story of using langchain/langgraph

  6. Python Project Management and Packaging: PEP 751 update and some of the remaining issues of packaging

  7. 2024 6 entries
  8. A Comprehensive Guide to Python Project Management and Packaging: Concepts Illustrated with uv - Part II

  9. A Comprehensive Guide to Python Project Management and Packaging: Concepts Illustrated with uv - Part I

  10. Deploying a Streamlit app on AWS EC2 (with your own domain name)

  11. Position Information in Transformer-Based Models: Exploring the main Methods and Approaches

    This article explains the main position encoding methods and how they were designed: learnable absolute and sinusoidal PEs; relative PEs (T5, ALiBi, FIRE); RoPE, which combines both; and no position encoding.

  12. Sparse Transformers

    This article delves into Sparse Transformers as introduced in the paper "Generating Long Sequences with Sparse Transformers". It focuses on the motivation and intuition behind the sparse factorizations, their theory, and the complexity proofs.

  13. Decoder-only Language Models Architecture Evolution (Part I)

    This is the first in a series of articles on the evolution of LLM architectures. It dives deep into the first three GPT models.

  14. 2023 1 entry
  15. Transformers: Attention Is All You Need

    Explore the Transformer architecture as presented in the paper 'Attention Is All You Need' by Vaswani et al. (2017). This article offers detailed code implementations and mathematical insights for each component, providing a comprehensive understanding of the model.