tpu
an archive of posts with this tag
| Date | Post |
|---|---|
| Mar 29, 2026 | Scaling LLMs: MoE Routing & JAX Parallelism on TPU |
| Mar 14, 2026 | TPU Profiling: When Math Meets Reality |
| Mar 08, 2026 | Serving LLaMA 3-70B: From Theory to Production Numbers |
| Mar 07, 2026 | Transformer Inference: Two Problems in Disguise |
| Mar 02, 2026 | Training LLaMA 3 on TPUs: Putting Theory Into Practice |
| Feb 10, 2026 | Training at Scale: When Communication Becomes the Enemy |
| Feb 07, 2026 | Transformer Math: The 6PT Rule and Other Accounting Tricks |
| Feb 07, 2026 | Sharding Strategies: The Art of Distributed Matrix Multiplication |
| Feb 04, 2026 | TPU Architecture: Understanding the Bandwidth Hierarchy |
| Feb 03, 2026 | Roofline Analysis: When Does Your Model Hit the Wall? |
| Feb 02, 2026 | Scaling LLMs: From Alchemy to Science (Part 0) |