Official Video Picks


  Official video & related videos [Distributed inference with llm-d’s “well-lit paths”]

Large language models like DeepSeek-R1 need an enormous number of parameters to perform complex tasks, which pushes them beyond what a single machine can serve and onto distributed hardware. Serving them on such hardware requires distributed inference to keep performance high. Enter llm-d, an open source framework for distributed LLM inference. Join Robert Shaw, Red Hat’s Director of Engineering for AI, as he dives into llm-d’s well-lit paths approach: a straightforward, efficient way to distribute LLM inference and meet the demands of large-scale AI workloads.

Chapters:
00:00 Introduction
00:43 The Enterprise Generative AI Inference Platform Stack
04:36 The llm-d Architecture Overview
08:39 Introducing Well-Lit Paths
09:54 Intelligent Inference Scheduling: Prefix-Aware & Load-Aware Routing
14:14 P/D Disaggregation: Splitting Prefill and Decode for Efficiency
17:45 Efficient KV Cache Transfer in vLLM with NIXL and RDMA
18:36 Flexible, Configurable Deployments with Heterogeneous Tensor Parallelism
19:32 KV Cache Management
22:58 Mixture of Experts Overview and Model Deployment
24:26 Wide Expert Parallelism (WideEP) Optimizations for MoE Scaling
27:45 Performance Summary and Closing

🔗 Read more about distributed inference: https://www.redhat.com/en/topics/ai/what-is-distributed-inference

#AI #RedHat #Kubernetes #llmd
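To make the 09:54 chapter's idea concrete, here is a minimal, hypothetical Python sketch of prefix-aware, load-aware routing: the scheduler prefers the replica that can reuse the longest cached prompt prefix (so its KV cache pays off), discounted by how busy that replica already is. The `Replica` class, the scoring function, and the `load_weight` parameter are illustrative assumptions for this sketch, not llm-d's actual interfaces.

```python
# Conceptual sketch of prefix-aware, load-aware routing (NOT llm-d's real API).
# Each replica keeps an approximate record of prompt prefixes whose KV cache it
# still holds; the scheduler favors replicas that can reuse the longest cached
# prefix, penalized by their current queue depth.

from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    queue_depth: int = 0                      # in-flight requests (load signal)
    cached_prefixes: set[str] = field(default_factory=set)


def longest_cached_prefix(replica: Replica, prompt: str) -> int:
    """Length of the longest cached prefix of `prompt` on this replica."""
    best = 0
    for p in replica.cached_prefixes:
        if prompt.startswith(p):
            best = max(best, len(p))
    return best


def pick_replica(replicas: list[Replica], prompt: str,
                 load_weight: float = 8.0) -> Replica:
    """Score = cache-reuse benefit minus a load penalty; highest score wins."""
    def score(r: Replica) -> float:
        return longest_cached_prefix(r, prompt) - load_weight * r.queue_depth
    return max(replicas, key=score)


if __name__ == "__main__":
    a = Replica("pod-a", queue_depth=1,
                cached_prefixes={"You are a helpful assistant."})
    b = Replica("pod-b", queue_depth=0)
    prompt = "You are a helpful assistant. Summarize this report:"
    print(pick_replica([a, b], prompt).name)
    # pod-a: its cached system-prompt prefix outweighs its extra queued request
```

The single `load_weight` knob captures the trade-off the talk describes: weight it too low and hot replicas overload; too high and the scheduler degenerates into plain least-loaded routing and forfeits KV cache hits.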