The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon China 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.
Please note: This schedule is automatically displayed in Hong Kong Standard Time (UTC+8:00). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis.
As AI technology advances, the demand for computing power for large-model training has accelerated the deployment of AI infrastructure. Data centers often face a "resource wall" between AI acceleration hardware of different generations and manufacturers, which leads to incompatible software and hardware stacks. This makes it a major challenge for AI infrastructure operators to maximize resource utilization. This talk focuses on technical solutions for collaborative training across chips of different architectures and shares practices for solving key problems such as heterogeneous training task splitting, heterogeneous training performance prediction, and heterogeneous hybrid communication. The project has been open-sourced and will continue to mature through the community.