The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon China 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.
Please note: This schedule is automatically displayed in Hong Kong Standard Time (UTC+8:00). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis.
Sign up or log in to add sessions to your schedule and sync them to your phone or calendar.
Please note we are unable to store any items overnight, and cameras, laptops, or other electronic devices cannot be stored in the cloakroom at any time.
As China's leading video platform, Bilibili faces four key challenges in multi-cluster AI workload management:
1. Workload diversity: training, inference, and video-processing workloads have different scheduling requirements.
2. Cross-cluster complexity: managing workloads across multiple Kubernetes clusters in expanding IDCs, under SLAs.
3. Performance demands: minimal startup latency and maximal scheduling efficiency for short-running tasks, e.g. video processing.
4. Efficiency-QoS balance: maximizing resource utilization while ensuring the stability of priority workloads.
This talk will share those experiences and delve into specific optimization techniques:
1. Leveraging and optimizing CNCF projects such as Karmada and Volcano to build a unified, high-performance AI workload scheduling platform.
2. Integrating technologies such as KubeRay to schedule a variety of online and offline AI workloads.
3. Maximizing resource efficiency through online/offline hybrid scheduling, tidal scheduling, and other techniques.
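The tidal scheduling mentioned above can be sketched with a toy model (an illustration of the general idea, not Bilibili's implementation): the offline quota for each hour is whatever capacity the online-load forecast leaves free, minus a safety buffer.

```python
# Toy illustration of tidal scheduling: offline (batch) workloads get the
# headroom that the online-load forecast leaves unused each hour.
# Sketch of the general idea only, not Bilibili's actual implementation.

def tidal_offline_quota(capacity, online_forecast, headroom=0.1):
    """Per-hour offline quota: capacity minus forecast online load,
    minus a safety buffer of `headroom` * capacity."""
    quota = []
    for online in online_forecast:
        free = capacity - online - capacity * headroom
        quota.append(max(0, round(free)))
    return quota

# 1000-core cluster; online load peaks during the day, dips at night.
forecast = [300, 250, 200, 600, 850, 900, 700, 400]
print(tidal_offline_quota(1000, forecast))  # [600, 650, 700, 300, 50, 0, 200, 500]
```

During the 900-core online peak the offline quota drops to zero; at night the batch tide comes back in.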
Technical Expert, Lead of Cloud Native Open Source, Huawei
Kevin Wang has been an outstanding contributor in the CNCF community since its beginning and leads the cloud native open source team at Huawei. Kevin has contributed critical enhancements to Kubernetes and led the incubation of the KubeEdge, Volcano, and Karmada projects in CNCF...
Long Xu is a Senior Software Engineer in the Infrastructure Department at Bilibili. He has rich experience in the Kubernetes field, including scheduling, autoscaling, and system stability.
When we started CNCF in 2015 to help advance container technology, Kubernetes was the seed technology, providing a de facto container orchestration platform for all cloud native applications. Almost a decade later, the community has exploded, with 200+ open source projects building on top of cloud native technologies. Looking ahead, what challenges will we face in the next decade? What gaps remain for users and contributors? And how do we evolve to meet the demands of an increasingly complex and connected world?
Let us review some of the key CNCF projects of today and lay out possible avenues for where cloud native is headed in the next decade: AI, agentic networks, sustainability, and beyond.
Lin is the Head of Open Source at Solo.io, and a CNCF TOC member and ambassador. She has worked on the Istio service mesh since the beginning of the project in 2017 and serves on the Istio Steering Committee and Technical Oversight Committee. Previously, she was a Senior Technical...
Kubernetes admins often struggle to understand pod activities, both for regular pods and for pods with various privileges. This session explores two use cases that highlight why eBay chose Tetragon, an eBPF-based observability and enforcement tool, for pod security:
1. Replacing Auditbeat with Tetragon: learn how Auditbeat rules were mapped to Tetragon tracing policies, which functionality gaps were identified, and how eBay contributed back to the community.
2. Auditing container process permissions: see how Tetragon helped analyze pod behavior and determine whether applications could migrate to more restrictive pod security policies, ensuring adherence to the principle of least privilege.
We also cover deployment challenges, such as integrating with SIEM platforms, resource utilization, and implementing runtime enforcement against unwanted pod behavior. This talk provides practical insights into using Tetragon for observability, policy refinement, and improving the overall pod security posture in Kubernetes environments.
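The rule-to-policy mapping can be sketched as follows. The generated structure loosely mirrors Tetragon's TracingPolicy CRD, but the exact field names should be treated as approximate, and the mapping function itself is hypothetical, not eBay's tooling.

```python
# Hypothetical sketch: translate an Auditbeat-style syscall audit rule into a
# Tetragon-style tracing policy document. Field names loosely follow the
# TracingPolicy CRD but are illustrative, not authoritative.

def auditbeat_rule_to_policy(name, syscall, binaries):
    """Build a TracingPolicy-like dict that traces `syscall` when invoked
    by any of the listed binaries."""
    return {
        "apiVersion": "cilium.io/v1alpha1",
        "kind": "TracingPolicy",
        "metadata": {"name": name},
        "spec": {
            "kprobes": [{
                "call": syscall,
                "syscall": True,
                "selectors": [{
                    "matchBinaries": [{"operator": "In", "values": binaries}],
                }],
            }],
        },
    }

policy = auditbeat_rule_to_policy("audit-exec", "sys_execve", ["/usr/bin/curl"])
print(policy["spec"]["kprobes"][0]["call"])  # sys_execve
```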
While SaaS AI providers like OpenAI offer convenient LLM services, they come with significant drawbacks: high costs, lack of customization, lack of privacy, and usage limitations that can throttle high-volume applications.
This presentation shows how a leading e-commerce website deployed a highly customized suite of LLM applications on private cloud infrastructure, reducing costs by 90% while maintaining complete control over scalability and quality of service. We'll discuss the technology stack for orchestrating inference workloads on cloud GPUs and explore practical strategies for building stable, scalable, high-performance AI apps on your own private cloud infrastructure.
Lv Yi is the CTO of 5miles, a leading e-commerce platform in the United States. With 19 years in IT, he is a cloud native enthusiast who previously served as a mobile business expert at AsiaInfo. In 2012, he led Zhangyue's systems evolution toward microservices architecture. At 5miles...
Vivian Hu is a Product Manager at Second State and a columnist at InfoQ. She is a founding member of the WasmEdge project. She organizes Rust and WebAssembly community events in Asia.
With the growing demand for heterogeneous computing power, Chinese users are gradually adopting domestic GPUs, especially for inference. vLLM, the most popular open-source inference project, has drawn widespread attention but does not support domestic chips, and Chinese inference engines are still maturing in functionality, performance, and ecosystem. In this session, we'll introduce how to adapt vLLM to support domestic GPUs, enabling acceleration features like PagedAttention, Continuous Batching, and Chunked Prefill. We'll also cover performance bottleneck analysis and chip operator development to maximize hardware potential. Additionally, Kubernetes has become the standard for container orchestration and the preferred platform for inference services. We'll show how to deploy the adapted vLLM engine on Kubernetes with a few lines of code using the open-source llmaz project, and explore how llmaz handles heterogeneous GPU scheduling, along with our practices for monitoring and elastic scaling.
Senior Software Engineer, China Mobile (Suzhou) Software Technology Co., Ltd.
The author has rich experience in cloud-native and AI inference development. He currently works at China Mobile, focusing on research and development of cloud-native and AI inference products. He has shared service mesh experience at technical conferences such as the...
Kante is a senior software engineer and open source enthusiast at DaoCloud; his work mostly involves scheduling, resource management, and LLM inference. He actively contributes to upstream Kubernetes as a SIG Scheduling maintainer and helps incubate several projects, such as Kueue...
Sponsor: Akamai
Demo: Unleash AI apps with edge-native speed on Akamai Cloud
Booth Number: G5
In order to facilitate networking and business relationships at the event, you may choose to visit a third party’s booth or access sponsored content. You are never required to visit third-party booths or to access sponsored content. When visiting a booth or participating in sponsored activities, the third party will receive some of your registration data. This data includes your first name, last name, title, company, address, email, answers to standard demographics questions (e.g., job function, industry), and details about the sponsored content or resources you interacted with. If you choose to interact with a booth or access sponsored content, you are explicitly consenting to receipt and use of such data by the third-party recipients, which will be subject to their own privacy policies.
Whether you’re looking to expand your knowledge, connect with experts, or just enjoy a break, the Solutions Showcase is the place to be:
- Exhibits: Visit our sponsor booths to learn about the latest technologies and services.
- CNCF Project Tables: Interact with project maintainers and gain insights into community engagement.
- Attendee T-Shirt Pick-up: Grab your free conference t-shirt.
- Coffee + Tea, Snacks, Lunch Pick-up: Enjoy delicious treats served in the Solutions Showcase.
Imagine your cloud-native applications as a bustling city. To ensure everything runs smoothly, you need to test its resilience by introducing controlled chaos, like planned roadblocks, to spot and fix weaknesses before they cause real trouble.
Join the LitmusChaos team, the folks behind this CNCF Incubating project, as they share the latest and greatest in chaos engineering. They'll walk you through new features from recent updates, like better resilience testing, improved observability, and scalability tools, all designed to tackle the real-world problems developers and SREs face daily.
You'll also get the inside scoop on the project's growth, how the community is shaping its future, and a sneak peek at what's coming next to make chaos engineering easier and more effective.
Sayan Mondal is a Senior Software Engineer II at Harness, building their Chaos Engineering platform and helping them shape the customer experience market. He's the maintainer of a few open-source libraries and is also a maintainer and community manager of LitmusChaos (the Incubating...
gRPC’s performance advantages hinge on minimizing latency, but its binary protocol and streaming capabilities make debugging and monitoring inherently opaque. While distributed tracing identifies bottlenecks, metrics like error rates and throughput are critical for holistic insights. Yet, manual instrumentation for these signals in gRPC is complex, error-prone, and lacks standardization.
In this talk, Purnesh Dixit unveils the new OpenTelemetry plugin for gRPC, developed by the gRPC team at Google, which provides unified metrics and tracing out of the box to monitor retries, diagnose streaming bottlenecks, and optimize performance without invasive code changes. For example, per-call client metrics track the overall RPC lifecycle (e.g., grpc.client.call.duration).
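A per-call duration metric like grpc.client.call.duration ultimately boils down to timing each RPC and recording the result in a histogram. Below is a minimal stdlib-only sketch of that idea; the real plugin hooks into gRPC's channel internals rather than wrapping calls like this.

```python
import time
from collections import defaultdict

# Minimal sketch of a client-side call-duration metric, in the spirit of
# grpc.client.call.duration. Illustrative only: the actual gRPC OpenTelemetry
# plugin records this inside the channel machinery.

class DurationHistogram:
    """Bucketed histogram of durations in seconds."""
    def __init__(self, bounds=(0.005, 0.05, 0.5, 5.0)):
        self.bounds = bounds
        self.buckets = defaultdict(int)

    def record(self, seconds):
        for bound in self.bounds:
            if seconds <= bound:
                self.buckets[bound] += 1
                return
        self.buckets[float("inf")] += 1  # overflow bucket

def timed_call(hist, rpc, *args):
    """Invoke `rpc`, recording its wall-clock duration even if it raises."""
    start = time.perf_counter()
    try:
        return rpc(*args)
    finally:
        hist.record(time.perf_counter() - start)

hist = DurationHistogram()
print(timed_call(hist, lambda x: x * 2, 21))  # 42
print(sum(hist.buckets.values()))             # 1
```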
Ensuring resilience in control planes is critical for organizations managing infrastructure and applications across multiple regions with Kubernetes. This talk presents a reference architecture for creating a Crossplane-based Global Control Plane, enhanced with k8gb for DNS-based failover and leveraging an Active/Passive setup. We’ll explore how Crossplane’s declarative infrastructure provisioning integrates with k8gb to build robust, scalable, and resilient multicluster environments. Key takeaways include:
- Architecting resilient multiregion control planes with Active/Passive roles
- Demonstrating failover mechanisms where the Passive control plane transitions to Active during failures
- Strategies for optimizing failover times while maintaining availability
This session will guide attendees through proven methods and real-world challenges of building resilient Global Control Planes, empowering them to manage critical workloads across geographically distributed regions confidently.
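The Active/Passive failover described above can be sketched as a toy model: health-check the Active control plane and, on failure, repoint the shared DNS name at the Passive one, promoting it to Active. This illustrates the pattern only, not k8gb's actual logic.

```python
# Toy model of DNS-based Active/Passive failover in the spirit of k8gb.
# Illustrative sketch, not the real reconciliation logic.

def resolve(dns_table, name, health):
    """Return the endpoint `name` should point at, failing over if the
    Active control plane is unhealthy."""
    active, passive = dns_table[name]
    if health.get(active, False):
        return active
    # Failover: Passive is promoted to Active, the old Active is demoted.
    dns_table[name] = (passive, active)
    return passive

table = {"api.example.com": ("cp-eu", "cp-us")}
health = {"cp-eu": False, "cp-us": True}
print(resolve(table, "api.example.com", health))  # cp-us
```

After the failover, `table` records `cp-us` as the new Active, so later resolutions keep landing on the promoted control plane.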
Yury is an experienced software engineer who strongly focuses on open-source, software quality and distributed systems. As the creator of k8gb (https://www.k8gb.io) and active contributor to the Crossplane ecosystem, he frequently speaks at conferences covering topics such as Control...
Peer Group Mentoring allows participants to meet with experienced open source veterans across many CNCF projects. Mentees are paired with 2 – 10 other people in a pod-like setting to explore technical, community, and career questions together.
Bloomberg’s Data Analytics Platform Engineering team supports a wide range of real-time streaming, large batch ETL, and data exploration use cases using Apache Flink, Apache Spark, and Trino across multi-cluster Kubernetes. However, deploying and managing these workflows efficiently at scale can be challenging due to varying resource requirements and uptime needs. For stateful applications like Apache Flink, ensuring recovery and state conservation after downtime is especially important.
This session will discuss how Bloomberg uses Karmada, a multi-cluster management system, to deploy and manage Apache Flink. We’ll also explore how Karmada’s capabilities can be expanded to handle additional data analytics workloads, including Apache Spark and Trino. The session will cover the unique requirements and real-life use-cases for each, including:
- Resource-aware workload scheduling
- Custom resource requirements and health interpretation
- State conservation during application failover
Ilan Filonenko is an Engineering Group Lead focusing on Cloud Native Data Analytics Infrastructure at Bloomberg - where he has designed and implemented distributed systems at both the application and infrastructure level. Previously, Ilan was an engineering consultant and technical...
Michas is a senior software engineer and tech lead on Bloomberg’s Streaming Analytics engineering team. The platform, which runs on Kubernetes, serves as the foundation for many of Bloomberg's data streaming use cases. Michas is also a frequent contributor to the CNCF community...
In this session, KubeEdge project maintainers will provide an overview of KubeEdge's architecture and its industry-specific use cases. The session will begin with a brief introduction to edge computing and its growing importance in IoT and distributed systems. The maintainers will then delve into the core components and architecture of KubeEdge, demonstrating how it extends Kubernetes' capabilities to manage edge computing workloads efficiently. They will share success stories and insights from organizations that have deployed KubeEdge in various edge environments, such as smart cities, industrial IoT, edge AI, robotics, and retail, highlighting the tangible benefits and transformational possibilities. Additionally, the session will introduce the certified KubeEdge conformance test, hardware test, KubeEdge course and certification, discuss advancements in technology and community governance within the KubeEdge project, and share the latest updates on the project's graduation status.
Yue Bao serves as a software engineer at Huawei Cloud. She now works 100% on open source, focusing on lightweight edge for KubeEdge. She is a maintainer of KubeEdge and the tech lead of KubeEdge SIG Release and Node. Before that, Yue worked on Huawei Cloud Intelligent...
Hongbing Zhang is Chief Operating Officer of DaoCloud. He is a veteran of the open source world: he founded the IBM China Linux team in 2011 and organized the team to make significant contributions to the Linux kernel, OpenStack, and Hadoop projects. He is now focusing on the cloud native domain and leading...
As large language model (LLM) applications are widely deployed, their complex architectures challenge business observability. APM probes, which rely on instrumentation or proxies, consume system resources and impact traffic and performance, restricting their use in complex scenarios. In addition, with multiple teams handling different LLM instances, it is hard to coordinate unified observability. To solve this, China Mobile's Panji platform collaborates with DeepFlow to achieve zero-intrusion (Zero Code), full-stack (Full Stack) observability instantly, using eBPF and Wasm technologies. eBPF collects real-time data at the kernel level, while Wasm plugins parse streaming requests. By integrating existing data, the platform provides a universal service map, distributed tracing, and multi-dimensional metric analysis, ensuring the stability and performance optimization of LLM applications.
Dr. Shang Jing, Chief Expert at China Mobile Group, has over 20 years of experience in IT system development, construction, and operation. Specializing in big data and cloud technologies, she led the development of China Mobile's Wutong Big Data Platform. Under her leadership, the...
Starting from graduate school at Huazhong University of Science and Technology in 2013, I joined the Tencent Cloud virtual network team in 2016, which gave me in-depth theoretical knowledge and practical experience in cloud networks. In 2018, I joined YUNSHAN Networks as PM...
Not everything can be thought through while designing or developing an application, so many design decisions are based on estimates and expected usage patterns.
More often than not, these estimates differ from reality and introduce inefficiencies across several fronts; if these inefficiencies are visible at all, it is usually much later in the lifecycle, when you already have several customers and a large footprint.
Hence, unless there is a clear sign of performance degradation or unjustified cost, there is often no incentive to invest time and effort for unknown gains.
In this session, Yash will outline a real-world case study of how they built an internal platform, driven by a wide range of metrics, to handle several post-deployment challenges:
1. rightsizing opportunities,
2. architecture migrations, such as moving to serverless,
3. finding the right maintenance windows, etc.,
and how impactful these minor optimizations turned out to be.
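A metrics-driven rightsizing recommendation of the kind mentioned can be sketched as sizing a container's CPU request to a high percentile of observed usage plus a safety margin. This is a generic illustration, not the internal platform from the talk.

```python
# Toy rightsizing recommendation from utilization metrics: request = a high
# percentile of observed CPU usage, padded by a safety margin.
# Generic sketch of the approach, not the platform described in the session.

def p_quantile(samples, q):
    """Nearest-rank quantile (simplistic on purpose; fine for a sketch)."""
    ordered = sorted(samples)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]

def recommend_cpu_request(usage_millicores, quantile=0.95, margin=1.15):
    """Recommend a CPU request from observed usage samples (millicores)."""
    return round(p_quantile(usage_millicores, quantile) * margin)

usage = [120, 150, 180, 200, 230, 210, 160, 900]  # millicores, one spike
print(recommend_cpu_request(usage))  # 1035: with few samples, p95 is the spike
```

In practice the quantile and margin are tuning knobs: a lower quantile ignores rare spikes and saves cost, at the price of occasional throttling.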
Yash works at Google as a Software Engineer and has 9 years of industry experience with cloud architectures and microservice development across Google and VMware. He has spoken at several international conferences, such as KubeCon + CloudNativeCon and Open Source...
In today's tech landscape, AI drives industry transformation, but enterprises face challenges in AI adoption: diverse hardware, complex workflows, and data privacy. OPEA, an open-source enterprise AI platform built from modular microservices, offers a unified solution for rapid deployment. Through a DeepSeek inference appliance case study, see how OPEA integrates with IT infrastructure, optimizes performance, and enhances reliability. Discover the new "Powered by OPEA" certification for confident AI deployment.
CSAL, the Cloud Storage Acceleration Layer for big data and AI, is an open-source user-mode FTL, cache, and I/O-trace component upstreamed into SPDK. It is used commercially in Alibaba's cloud storage system (see https://www.solidigm.com/products/technology/cloud-storage-acceleration-layer-write-shaping-csal.html), and Alibaba and Solidigm jointly published a paper on it at EuroSys 2024 (https://dl.acm.org/doi/pdf/10.1145/3627703.3629566). This session covers joint development with the NVIDIA DPU team and BeeGFS:
1. CSAL leverages DPU DRAM as its write buffer, achieving the best storage latency yet while guaranteeing data consistency.
2. High-density QLC storage is favored by the AI industry because it saves power and space in AI data centers; DPU storage solutions achieve the same, so combining the two is a natural fit.
3. CSAL brings advanced storage I/O shaping, caching, and data placement software into the NVIDIA DPU DOCA storage software service.
4. Experimental data and a report on DPU + CSAL + BeeGFS will be shared.
Wayne Gao is a Principal Engineer and storage solution architect who worked on CSAL from PF to the Alibaba commercial release. Wayne was also the main developer of the CSAL pmem/DSA and cxl.mem PF efforts from Intel to Solidigm. Before joining Intel, Wayne had over 20 years of storage...
I will share the progress of the Ingress-NGINX project in this topic, as well as our newly incubated project, Ingate. Ingate is a project we created to actively adopt the Gateway API, and we will explore the next steps in the Ingate project based on the successes and failures we've experienced in the Ingress-NGINX project, along with user demands for frequently used features.
CNCF Ambassador, Kubernetes Ingress-NGINX maintainer, Kong Inc.
Jintao Zhang is a Microsoft MVP, CNCF Ambassador, Apache PMC member, and Kubernetes Ingress-NGINX maintainer. He is skilled in cloud-native technology and the Azure technology stack.
You might already be using a CI/CD solution, but are you 100% sure things will roll out without a glitch once you go to production? Unfortunately, differences between testing/staging and production environments are virtually unavoidable. There is always a risk of unforeseen issues related to your production environment and/or actual load, which can lead to disruptions for your users.
Progressive delivery is the next step after Continuous Delivery: roll out your application in a controlled and automated way so you can verify and test it *in production* before it becomes fully available to your entire user base.
Embrace GitOps and Progressive Delivery with techniques like blue-green, canary release, shadowing traffic, dark launches and automatic metrics-based rollouts to validate the application in production using Kubernetes and tools like Istio, Prometheus, ArgoCD, and Argo Rollouts.
Come to this session to learn about Progressive Delivery in action using Kubernetes.
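The automatic metrics-based rollouts mentioned above can be sketched as a loop that steps up the canary's traffic weight while the observed error rate stays under a threshold, and rolls back the moment it does not. This illustrates the pattern only, not Argo Rollouts' implementation.

```python
# Toy metrics-based canary rollout in the spirit of Argo Rollouts: increase
# the canary traffic weight step by step, aborting on bad metrics.
# Illustrative sketch, not the real controller logic.

def run_canary(error_rate_at, steps=(5, 25, 50, 100), threshold=0.01):
    """`error_rate_at(weight)` returns the observed error rate while the
    canary serves `weight`% of traffic. Promote on success, else roll back."""
    weight = 0
    for step in steps:
        if error_rate_at(step) > threshold:
            return 0, "rolled-back"  # abort: all traffic back to stable
        weight = step
    return weight, "promoted"

# Healthy canary: error rate stays low at every traffic step.
print(run_canary(lambda w: 0.002))                    # (100, 'promoted')
# Bad canary: errors appear once it receives real traffic.
print(run_canary(lambda w: 0.2 if w >= 25 else 0.0))  # (0, 'rolled-back')
```

In a real setup `error_rate_at` would be a Prometheus query evaluated during an analysis window, and the weight shifts would be applied through the mesh (e.g. Istio traffic splitting).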
Kevin is a Java Champion, software engineer, author and international speaker with a passion for Open Source, Java, and Cloud Native Development & Deployment practices. He currently works as developer advocate at Red Hat where he gets to enjoy working with Open Source projects and...
As the multi-cluster pattern continues to evolve, managing K8s identities, credentials, and permissions for teams and multi-cluster apps, such as Argo and Kueue, has become a hassle, typically involving managing individual service accounts on each cluster and passing credentials around. Such setup is often scattered, repetitive, difficult to track/audit, and may impose security and ops complications. This is especially true with hybrid environments, where different solutions could be in play across platforms.
This demo presents a solution based on OpenID, SPIFFE/SPIRE, and the Cluster Inventory API from the Multi-Cluster SIG that provides a unified, seamless, and secure auth experience. Built on the CNCF multi-cluster projects OCM and KubeFleet, it can inspire attendees to leverage open source solutions to eliminate credential sprawl, reduce operational complexity, and enhance security in hybrid cloud environments when setting up teams and applications to access a multi-cluster setup.
Chen Yu is a senior software engineer at Microsoft with a keen interest in cloud-native computing. He is currently working on Multi-Cluster Kubernetes and contributing to the Fleet project open-sourced by Azure Kubernetes Service.
Zhu Jian is a senior software engineer at RedHat, a speaker at Kubecon China 2024, and a core contributor to the open cluster management project. Jian enjoys solving multi-cluster workload distribution problems and extending OCM with add-ons.
Strong communities foster a feeling of belonging by providing opportunities for interaction, collaboration, and shared experiences. We hope to do just that with a gathering of attendees who identify as women and non-binary individuals at KubeCon + CloudNativeCon China! Join fellow women community members for networking and connection.
As AI tackles increasingly complex tasks, traditional LLMs show limitations in action decision-making and multi-step reasoning, making autonomous planning and dynamic correction key challenges. ZTE's Co-Sight agent system addresses this with a multi-agent (Plan-Actor) collaborative architecture. Its dual-level design separates planning (task decomposition, path generation) from execution, significantly reducing LLM search space. Dynamic task adjustment is achieved via DAG parallel thinking, dynamic context, guardrails, and hierarchical reflection. Co-Sight has demonstrated excellent performance on the GAIA benchmark, particularly showcasing superior stability in complex Level 2 multi-step tasks.
Recently, the health of open-source projects, particularly vendor diversity and neutrality, has become a key topic of discussion. Many projects have faced challenges due to a lack of vendor diversity, threatening their sustainability. It is increasingly clear that setting up the right governance structure and project team during a project’s growth is critical. KubeEdge, the industry's first cloud-native open-source edge computing project, has grown from its initial launch in 2018 to achieving CNCF graduation this year. Over the past few years, KubeEdge has evolved from a small project into a diverse, collaborative, multi-vendor open-source community. In this panel, we will discuss the lessons learned from the KubeEdge community's graduation journey, focusing on key strategies in technical planning, community governance, developer growth, and project maintenance. Join us to explore how to build a multi-vendor, diverse community and how to expand into different industries.
Huan is an open source enthusiast and cloud native technology advocate. He is currently a CNCF ambassador and a TSC member of the KubeEdge project, and serves as a senior technical director at HarmonyCloud.
KubeEdge TSC Member, Senior Software Engineer at Huawei Cloud. Focusing on Cloud Native, Kubernetes, Service Mesh, Edge Computing, Edge AI, and other fields. Currently maintaining the KubeEdge project, a CNCF graduated project, with rich experience in Cloud Native and Edge Computing...
KubeSphere founding member, KubeEdge TSC member, Director of Cloud Platform, QingCloud Technologies
Benjamin Huo leads QingCloud Technologies' Architect team and Observability Team. He is the founding member of KubeSphere and the co-author of Fluent Operator, Kube-Events, Notification Manager, OpenFunction, and most recently eBPFConductor. He loves cloud-native technologies especially...
Yue Bao serves as a software engineer at Huawei Cloud. She now works 100% on open source, focusing on lightweight edge for KubeEdge. She is a maintainer of KubeEdge and the tech lead of KubeEdge SIG Release and Node. Before that, Yue worked on Huawei Cloud Intelligent...
Hongbing Zhang is Chief Operating Officer of DaoCloud. He is a veteran of the open source world: he founded the IBM China Linux team in 2011 and organized the team to make significant contributions to the Linux kernel, OpenStack, and Hadoop projects. He is now focusing on the cloud native domain and leading...
Kubespray, recognized by Kubernetes' SIG Cluster Lifecycle, deploys production-ready Kubernetes clusters on bare metal, enhancing performance for AI applications with robust GPU support. This session covers Kubespray's fundamentals, key features, and updates.
As AI workloads like LLMs grow, scalable GPU clusters are essential. Engineers will share insights from deploying custom GPU clusters at scale with Kubespray, discussing challenges and best practices. Attendees will learn to integrate Kubernetes technologies like LWS, Kueue, Gateway API Inference Extension, DRA, and tensor parallelism to enhance AI workloads like RAG and LoRA, improving resource utilization and performance.
We'll share Kubespray inventory source code for customizing AI clusters and show how to use Kubernetes operators to define infrastructure in private clouds, enabling efficient cluster scaling.
Rong is a software engineer at vivo developing platform services on top of Kubernetes and providing containerized infrastructure, focusing on the closed-loop system of scheduling, GPU technology, networking, and cluster management.
Kay Yan is a kubespray maintainer and a containerd/nerdctl maintainer. He is a Principal Software Engineer at DaoCloud and has developed the DaoCloud Enterprise Kubernetes Platform since 2016.
Do you think platform engineering is too hard? Or is it just a buzzword? Is the CNCF landscape too tricky to visualize? If you’ve been in this industry long enough, you should know that platform engineering has been around for a long time.
Most of us have been trying to build developer platforms for decades, and most of us have failed at that. That begs the questions: “What is different now?” “Why will this time be different?” and “Do we have a chance to succeed?”
We’ll take a look at the past, the present, and the future of platform engineering. We’ll see what we were doing in the past, what we did wrong, and why we failed. Further on, we’ll see what we (the industry as a whole) are doing now and, more importantly, where we might go from here.
Get ready for the hard truths and challenges you will face when trying to build a platform based on Kubernetes. Join us for a pain-infused journey filled with challenges teams will face when building platforms to enable other teams.
Viktor Farcic is a lead rapscallion at Upbound, a member of the CNCF Ambassadors, Google Developer Experts, CDF Ambassadors, and GitHub Stars groups, and a published author. He is a host of the YouTube channel DevOps Toolkit and a co-host of DevOps Paradox.
Mauricio works as an Open Source Software Engineer at @Diagrid, contributing to and driving initiatives for the Dapr OSS project. He also serves as a Steering Committee member for the Knative project and co-leads the Knative Functions initiative. He published a book titled...
Maximizing security in multi-tenant clusters while maintaining cost-effectiveness is crucial for enterprise OPS. Most enterprise clusters deploy multiple daemonsets, which are attractive targets for attackers seeking to escape and move laterally, ultimately taking over the entire cluster.
The SIG community has introduced several advanced security features recently, such as CRD Field Selectors, Field and Label Selector Authorization, validating admission policy (VAP), and Structured Authorization Config. These allow users to define more flexible authorization configurations, addressing filtering and authorization needs for CRDs, kubelet, and other resources in multi-tenant environments.
We will share lessons learned from node escape incidents, demonstrate how to implement these new features, and show how to use the Common Expression Language (CEL) to configure customized policies in the Authorization Webhook and VAP, resulting in more node-specific restrictions within clusters.
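In a real ValidatingAdmissionPolicy the rule is written in CEL; the node-restriction idea can be illustrated in plain Python. The CEL expression in the comment below is a hypothetical example, not one taken from the talk.

```python
# Node-restriction idea from the talk, emulated in Python. In a real
# ValidatingAdmissionPolicy the rule would be a CEL expression such as
#   request.userInfo.username == "system:node:" + object.spec.nodeName
# (hypothetical example). Here we check the same invariant directly:
# a kubelet may only touch objects bound to its own node.

def allow(request_user, obj):
    """Admit the request only if the requesting kubelet's identity matches
    the node the object is bound to."""
    node = obj.get("spec", {}).get("nodeName")
    return node is not None and request_user == "system:node:" + node

pod = {"spec": {"nodeName": "node-a"}}
print(allow("system:node:node-a", pod))  # True: kubelet touches its own node
print(allow("system:node:node-b", pod))  # False: lateral movement blocked
```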
Dahu Kuang is a Security Tech Lead on the Alibaba Cloud Container Service for Kubernetes (ACK) team, focusing on the design and implementation of container security-related work, especially within the context of secure supply chain.
Cheng Gao, Senior Security Engineer at Alibaba Cloud, focuses on the Security Development Lifecycle (SDL) for cloud-native applications. With expertise in container services, observability, and Serverless architectures, Cheng has led security assurance for several internal container...
With the development of AI technology, the computing power demanded by large model training has accelerated the build-out of AI infrastructure. Data centers often hit a "resource wall" between AI acceleration hardware of different generations and from different manufacturers, causing software and hardware stack incompatibilities. Maximizing resource utilization is therefore a big challenge for AI infra operators. This talk focuses on technical solutions for collaborative training using chips of different architectures, sharing practices for solving key problems such as heterogeneous training task splitting, heterogeneous training performance prediction, and heterogeneous hybrid communication. The project has been open sourced and will mature further through the community.
Join this interactive session for a brief overview of the Cloud Native Computing Foundation (CNCF) Technical Oversight Committee (TOC), including recent initiatives and opportunities to get involved. Learn how the TOC is helping shape the next decade of cloud native technologies. Following the overview, we’ll open the floor to your questions, whether they’re technical or about building leadership within CNCF. Initial seeding questions include:
What are some of the latest Cloud Native AI initiatives?
How can we encourage more CNCF and TAG contributions from Asian countries?
What are the possible paths to becoming a CNCF TOC member?
Technical Expert, Lead of Cloud Native Open Source, Huawei
Kevin Wang has been an outstanding contributor in the CNCF community since its beginning and is the leader of the cloud native open source team at Huawei. Kevin has contributed critical enhancements to Kubernetes, led the incubation of the KubeEdge, Volcano, Karmada projects in CNCF... Read More →
Lin is the Head of Open Source at Solo.io, and a CNCF TOC member and ambassador. She has worked on the Istio service mesh since the beginning of the project in 2017 and serves on the Istio Steering Committee and Technical Oversight Committee. Previously, she was a Senior Technical... Read More →
Chris Aniszczyk is an open source executive and engineer with a passion for building a better world through open collaboration. He's currently a CTO at the Linux Foundation focused on developer relations and running the Open Container Initiative (OCI) / Cloud Native Computing Foundation... Read More →
Training trillion-parameter AI models requires significant GPU resources, where any idle time leads to increased costs. Maintaining full-speed GPU utilization is crucial, yet hardware and software failures (such as firmware, kernel, or hardware issues) often disrupt large-scale training. For example, LLaMA3 experienced 419 interruptions over 54 days, with 78% due to hardware issues, underscoring the necessity of automated anomaly recovery. At Ant Group, we will share:
GPU Monitoring: Comprehensive monitoring from hardware to applications to ensure optimal performance.
Self-Healing for Large GPU Clusters: Automated fault isolation, recovery from kernel panics, and node reprovisioning for clusters with 10,000+ GPUs.
Core Service Level Objectives (SLOs): Achieving over 98% GPU availability and more than 90% automatic fault isolation.
Predictive Maintenance: Using failure pattern analysis to reduce downtime and improve reliability.
Senior Engineer, Ant Group
Yang Cao is a senior engineer at Ant Group, currently focusing on ensuring the stability of large-scale distributed training on Kubernetes.
When you're new to Kubernetes, Policy as Code (PaC) can be a very unfamiliar topic. But as you become more familiar with Kubernetes, you'll likely become interested in using it securely. Since Kubernetes is essentially a declarative, YAML-driven system, expressing security as code as well improves usability and reduces human error.
To make PaC easier to understand, I'll demonstrate the admission control layer directly in Kubernetes. Until recently this layer was webhook-based, but since v1.23 Kubernetes has actively embraced the Common Expression Language (CEL), making it possible to apply policy as code directly inside the cluster. Validating Admission Policy became GA in v1.30, and Mutating Admission Policy is Alpha in v1.32.
Based on this outline, I'll talk about how PaC has been applied to Kubernetes in the past, how it works today, and finally, how we can expect it to be integrated into Kubernetes in the future.
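To make the webhook-versus-CEL contrast concrete, here is a minimal sketch (my own illustration, not material from the talk) of a Validating Admission Policy and the binding that activates it; unlike a webhook, no out-of-cluster endpoint is involved, and the API server evaluates the CEL expression in-process.

```yaml
# Minimal illustrative pair: the policy holds the CEL rule, the binding
# activates it cluster-wide. Resource and label names are examples only.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-team-label
spec:
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  validations:
    - expression: "has(object.metadata.labels) && 'team' in object.metadata.labels"
      message: "all deployments must carry a 'team' label"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-team-label-binding
spec:
  policyName: require-team-label
  validationActions: ["Deny"]
```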
Hoon Jo is a Cloud Solutions Architect and cloud native engineer at Megazone. He has spoken many times on cloud native technologies and works to spread cloud native ubiquitously around the world. He has written several books, the latest of which is 『CONTAINER INFRASTRUCTURE... Read More →
Constructing and managing platforms for diverse teams and workloads presents a significant challenge in today's cloud-native environment. This session introduces the concept of composable platforms, using modular, reusable components as the foundation for platform engineering. This talk will demonstrate how using Kratix, a workload-centric framework, and Backstage, an extensible developer portal, enables the creation of self-service platforms that balance standardization with adaptability.
The session will detail platform design for scalability and governance, streamlining developer workflows through Backstage, and using Kratix Promises for varied workload requirements. Attendees will gain practical insights into building scalable and maintainable platforms through real-world examples, architectural patterns, and a live demonstration of a fully integrated Kratix-Backstage deployment.
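For orientation, a Kratix Promise is itself a Kubernetes custom resource. The fragment below is a heavily abbreviated sketch with placeholder contents; treat the Kratix documentation, not this sketch, as authoritative for the actual schema.

```yaml
# Abbreviated, illustrative shape of a Kratix Promise: it publishes an API
# for a platform capability and wires workflows that fulfil each request.
# Field contents here are placeholders; see the Kratix docs for the real schema.
apiVersion: platform.kratix.io/v1alpha1
kind: Promise
metadata:
  name: jupyter-notebook
spec:
  api: {}        # placeholder: the CRD offered to application teams
  workflows: {}  # placeholder: pipelines that run when a team requests the resource
```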
Hossein is an experienced cloud computing professional with nearly a decade of expertise in distributed systems and cloud technologies. He began as a student specializing in cloud automation and progressed to a full-time role focusing on on-premises cloud infrastructure and containers... Read More →
For AI developers on Kubernetes, whether working in Jupyter notebooks or serving LLMs, Python dependency management is a constant headache:
- Prepare a set of base images? The maintenance burden becomes a nightmare, since (1) packages in the AI world bump versions rapidly, and (2) different LLM codebases require different permutations and combinations of packages.
- Leave users to `pip install` on their own? The resulting waiting blocks productivity and efficiency, as anyone who has done it will agree.
- On a GPU cloud, package preparation can be costly in itself: you rent a GPU, then waste it waiting for pip downloads.
- You may choose to DIY and docker-commit your own base images, but then you have to manage the Dockerfile, a registry, and additional cloud costs if you lack a local Docker environment.
To address this, we introduce https://github.com/BaizeAI/dataset.
The solution: 1. A CRD to describe the dependencies and environment. 2. A Kubernetes Job to pre-load the packages. 3. A PVC to store and mount them. 4. `conda` to switch between environments. 5. Sharing across namespaces.
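The five steps above could plausibly be driven by a custom resource along these lines. Note this is a hypothetical sketch: the apiVersion, kind, and every field name are invented for illustration, so consult the linked repository for the real schema.

```yaml
# Hypothetical custom resource illustrating steps 1-5 above. All names
# are invented; see https://github.com/BaizeAI/dataset for the real CRD.
apiVersion: example.io/v1alpha1
kind: CondaEnvironment
metadata:
  name: llm-serving
  namespace: ai-team
spec:
  pythonVersion: "3.11"
  pipPackages:            # dependencies declared once, pre-loaded by a Job
    - vllm
    - transformers
  storage:
    pvcName: conda-envs   # packages stored on a PVC, mounted by workloads
  shareWithNamespaces:    # the same environment reused across namespaces
    - notebooks
```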
Cloud native developer, AI researcher, and Gopher with 5 years of experience across many development fields, including AI, data science, backend, and frontend. Co-founder of https://github.com/nolebase
In the AI era, enterprises need to collect more data to build high-quality AI applications, including structured data (databases, data warehouses, etc.) and unstructured data (data lakes, document libraries, real-time data, etc.). Data integrity and compliance play a key role in building AI applications, and this is where metadata delivers its value. Providing AI users with a unified data view (covering data discovery, data semantics, data lineage, data permissions, and more) so they can better discover and use multi-source heterogeneous data, and managing the data life cycle in line with enterprise governance requirements to avoid resource waste and security issues, have become pressing needs for every enterprise.
Apache Gravitino provides a unified API to access multiple data sources and data storages, supports multiple data engines and machine learning frameworks, and implements unified naming, permissions, lineage, auditing, and other functions on top of unified metadata, thereby greatly simplifying data operations and breaking down data silos. It has already been adopted by companies such as Xiaomi, Bilibili, Pinterest, and Uber, with good results. This session will introduce the background, architecture, core functions, and use cases of Gravitino.
In the rapidly evolving landscape of cloud computing and microservices architecture, efficiently and securely managing communication between services has become a critical challenge. Traditional methods of network traffic authentication often become a performance bottleneck, especially when handling large-scale data flows. This session introduces an innovative solution — leveraging Linux kernel technology XDP (eXpress Data Path) to achieve efficient traffic authentication for service-to-service communications.
We will delve into how to use XDP for rapid filtering and processing of packets before they enter the system's protocol stack, significantly reducing latency and enhancing overall system throughput. Additionally, we will share practical application experiences from projects such as Kmesh, including but not limited to performance tuning, security considerations, and integration with other network security strategies.
Operating system engineer at Huawei Technologies Co., Ltd., core member of Kmesh, and contributor to libxdp. Enthusiastic about cloud native technology and eBPF-based high-performance networking.