Official Video Pickup


Official video & related videos: [vLLM Office Hours #41] LLM Compressor Update & Case Study - January 22, 2026

In this vLLM office hours session, we shared the latest updates from across the vLLM ecosystem and took a deep dive into recent advances in model quantization with LLM Compressor. We kicked things off with our regular bi-weekly vLLM project update from core committer Michael Goin, covering recent vLLM-Omni model releases, the vLLM v0.14.0 release, and the availability of official ROCm wheels and container images. We then welcomed the Red Hat AI team for an in-depth look at LLM Compressor 0.9.0, including new attention and KV cache quantization capabilities, model-free PTQ for FP8, AutoRound, experimental MXFP4 support, and performance improvements such as batched calibration and expanded AWQ support. To close out the session, Cohere shared a real-world case study on how they use LLM Compressor for model quantization in production, discussing practical considerations, trade-offs, and lessons learned at scale.

Timestamps:
00:00 Intro and agenda
07:47 vLLM v0.14.0 update
20:37 LLM Compressor 0.9.0 deep dive
40:40 Cohere quantization case study
48:30 Q&A and discussion

Session slides: https://docs.google.com/presentation/d/1_r50rmRcB6PY2hBEFk5lqJfgtnb3NQzMHTmNjClQHi0
Explore and join our bi-weekly vLLM office hours: https://red.ht/office-hours