Scaling LLM Inference: Innovations in Tensor Parallelism, Context Parallelism, and Expert Parallelism
At Meta, we are constantly pushing the boundaries of LLM inference systems to power applications such as the Meta AI App. We’re sharing how we…
Sapling is a scalable, user-friendly, and open-source source control system that powers Meta’s monorepo. As discussed at the GitMerge 2024 conference session on branching, designing…
We’re sharing details on our journey to scale Meta’s Backbone network to support the increasing demands of new and existing AI workloads. We’ve developed new…
The next chapter of real-time analytics at Uber. Uncover how Uber restructured its Apache Pinot™ query architecture to unlock a ton of new features, redefining…
We’re presenting Design for Sustainability, a set of technical design principles for new designs of IT hardware to reduce emissions and cost through reuse, extending…
As we focus on our goal of achieving net zero emissions in 2030, we also aim to create a common taxonomy for the entire industry…
At Open Compute Project Summit (OCP) 2025, we’re sharing details about the direction of next-generation network fabrics for our AI training clusters. We’ve expanded our…
From manual GPU configs to one-command clusters: Join Serrana Aguirregaray and Nathaniel Jenkins as they tell the story of Discord’s journey to build a developer-friendly…
Do you embrace tricks in the Shop or turn them into treats? Trick out your profile with shockingly sweet decorations, or treat your friend to…
Meta open-sourced React over a decade ago to help developers build better user experiences. Since then, React has grown into one of the world’s most…
It’s mid-2023 and we’ve identified some opportunities to improve our reliability. Fast forward to January 2025. Customer impact hours are reduced from the peak…
Check out the finer details of the more technical fixes implemented in Discord recently.
OpenZL is a new open source data compression framework that offers lossless compression for structured data. OpenZL is designed to offer the performance of a…
Cadence Workflow is now part of the Cloud Native Computing Foundation®. This milestone strengthens our commitment to open source and ensures continued investment in the…
We’re introducing Candle, a new submarine cable connecting countries across East Asia and Southeast Asia. We’re also announcing several updates to our subsea cables across…