Official & related videos [Distributed inference with llm-d’s “well-lit paths”]
Large language models like DeepSeek-R1 carry enormous parameter counts in order to perform complex tasks, far more than a single accelerator can hold, which creates the need for distributed hardware. Serving a model across such a system requires distributed inference to keep performance high. Enter llm-d, an open source framework for distributed LLM inference.
Join Robert Shaw, Red Hat’s Director of Engineering for AI, as he dives into llm-d’s well-lit paths approach: a straightforward, efficient way to distribute LLM inference and meet the demands of large-scale AI workloads.
00:00 Introduction
00:43 The Enterprise Generative AI Inference Platform Stack
04:36 The llm-d Architecture Overview
08:39 Introducing Well-Lit Paths
09:54 Intelligent Inference Scheduling: Prefix-Aware & Load-Aware Routing
14:14 P/D Disaggregation: Splitting Prefill and Decode for Efficiency
17:45 Efficient KV Cache Transfer in vLLM with NIXL and RDMA
18:36 Flexible, Configurable Deployments with Heterogeneous Tensor Parallelism
19:32 KV Cache Management
22:58 Mixture of Experts Overview and Model Deployment
24:26 Wide Expert Parallelism (WideEP) Optimizations for MoE Scaling
27:45 Performance Summary and Closing
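
The 09:54 chapter covers intelligent inference scheduling. As a rough, hypothetical sketch only (all names are invented; this is not llm-d’s actual scheduler or API), the core idea of prefix-aware, load-aware routing can be expressed as: prefer replicas whose KV cache already holds the request’s prompt prefix, and fall back to the least-loaded replica when no warm one has capacity.

# Hypothetical sketch of prefix-aware, load-aware routing; not llm-d's real code.
import hashlib

class Replica:
    def __init__(self, name):
        self.name = name
        self.cached_prefixes = set()  # hashes of prompt prefixes held in KV cache
        self.queue_depth = 0          # in-flight requests, used as the load signal

def prefix_hash(prompt, block=256):
    # Hash the leading block of the prompt as a coarse cache key.
    return hashlib.sha256(prompt[:block].encode()).hexdigest()

def route(prompt, replicas, max_load=8):
    h = prefix_hash(prompt)
    # Prefix-aware: prefer replicas that already hold this prefix's KV cache.
    warm = [r for r in replicas if h in r.cached_prefixes and r.queue_depth < max_load]
    # Load-aware: among the candidates, pick the least-loaded replica.
    chosen = min(warm or replicas, key=lambda r: r.queue_depth)
    chosen.cached_prefixes.add(h)
    chosen.queue_depth += 1
    return chosen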
🔗 Read more about distributed inference: https://www.redhat.com/en/topics/ai/what-is-distributed-inference
#AI #RedHat #Kubernetes #llmd