10-11 June

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon China 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Hong Kong Standard Time (UTC+8:00). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change, and session seating is available on a first-come, first-served basis.
Tuesday June 10, 2025 16:15 - 16:45 HKT
Managing large-scale LLM inference workloads on Kubernetes requires more than just high-performance inference engines like vLLM. It demands a comprehensive control plane that integrates deeply with engines while addressing the complexities of large-scale operations. This need inspired the creation of AIBrix, a Kubernetes-native control plane designed to scale LLM inference with modularity, flexibility, and cutting-edge algorithms.

AIBrix introduces a pluggable architecture with components for LLM-specific autoscaling, high-density LoRA management, distributed KV cache, heterogeneous serving, model loading, and more. AIBrix emphasizes deep co-design with inference engines, enabling advanced features and optimizations. This talk will demonstrate AIBrix in action, showcasing its ability to improve scalability and optimize resource utilization. Additionally, we will present detailed benchmarks to evaluate the performance of these components, providing actionable insights for practitioners.
Speakers
Jiaxin

Software Engineer, ByteDance
Jiaxin works at ByteDance Infrastructure Lab, focusing on serverless and AI infrastructure. As a co-chair of Kubernetes WG-Serving, he drives innovation and contributes to the future of scalable AI systems.
Liguang Xie

Director of Engineering, ByteDance
Liguang Xie is an Engineering Lead at ByteDance’s Compute Infrastructure Team, leading next-gen serverless infrastructure design and overseeing open-source, research, and engineering efforts. He has extensive experience in large-scale distributed systems, AI/ML platforms, and LLM/GNN...
Level 19 | Crystal Court I
  AI + ML
