Tuesday, November 12
 

9:00am MST

Cloud Native + Kubernetes AI Day | Welcome + Opening Remarks - Amber Graner, Rajas Kakodkar, Ricardo Rocha, Program Co-Chairs
Tuesday November 12, 2024 9:00am - 9:05am MST
Speakers
Rajas Kakodkar

Staff Software Engineer at Broadcom | Tech Lead CNCF TAG Runtime, Broadcom
Rajas is a staff software engineer at Broadcom and a tech lead of the CNCF Technical Advisory Group, Runtime. He is actively involved in the AI working group in the CNCF. He is a Kubernetes contributor and has been a maintainer of the Kube Proxy Next Gen Project. He has also served...
Ricardo Rocha

Lead Platforms Infrastructure, CERN
Ricardo leads the Platform Infrastructure team at CERN with a strong focus on cloud native deployments and machine learning. For several years he has led the internal effort to transition services and workloads to cloud native technologies, as well as dissemination and training...
Amber Graner

Product Owner
I’m a seasoned professional with a rich history in open source communities – Ubuntu, Linaro, Open Compute Project Foundation, Zeek, Kubeflow and more. I’m known for my leadership skills and commitment to inclusivity. I served as an all-source intelligence analyst in the US Army...
Salt Palace | Level 1 | Grand Ballroom A

9:10am MST

SkyRay: Seamlessly Extending KubeRay to Multi-Cluster Multi-Cloud Operation - Anne Holler, Elotl
Tuesday November 12, 2024 9:10am - 9:35am MST
Ray is a unified framework for scaling AI applications from a laptop to a cluster. KubeRay supports the creation, deletion, and scaling of Ray clusters on K8s, along with managing Ray jobs and services on the Ray clusters. This talk introduces SkyRay, in which KubeRay is extended towards the Sky computing model via interoperation with a multi-cluster fleet manager. With SkyRay, each Ray cluster is seamlessly scheduled onto a cloud K8s cluster suited to the Ray cluster's resource needs and policy requirements. The policies can capture a variety of cluster characteristics, e.g., desired cloud provider, region, K8s version, service quality, and GPU type availability. Fleet manager policy updates can be used to trigger automatic migration of Ray clusters between K8s workload clusters. The talk presents several example use cases for SkyRay, including cluster selection for resource needs, service availability, development vs production cluster configuration, and K8s version upgrade.
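For readers unfamiliar with KubeRay's API, the sketch below shows roughly what creating a Ray cluster as a Kubernetes custom resource looks like via the Kubernetes Python client. The placement-policy annotation is a hypothetical stand-in for SkyRay's policy mechanism, which the talk, not this snippet, defines.

```python
# Minimal sketch: create a KubeRay RayCluster via the Kubernetes Python client.
# The "placement-policy" annotation is hypothetical, standing in for whatever
# mechanism SkyRay uses to express cloud/region/GPU requirements.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

ray_cluster = {
    "apiVersion": "ray.io/v1",
    "kind": "RayCluster",
    "metadata": {
        "name": "demo",
        "annotations": {"example.com/placement-policy": "gpu-a100-us-west"},  # hypothetical
    },
    "spec": {
        "headGroupSpec": {
            "rayStartParams": {},
            "template": {"spec": {"containers": [{
                "name": "ray-head", "image": "rayproject/ray:2.9.0",
            }]}},
        },
        "workerGroupSpecs": [{
            "groupName": "workers", "replicas": 2, "minReplicas": 0, "maxReplicas": 4,
            "rayStartParams": {},
            "template": {"spec": {"containers": [{
                "name": "ray-worker", "image": "rayproject/ray:2.9.0",
                "resources": {"limits": {"nvidia.com/gpu": "1"}},
            }]}},
        }],
    },
}

api.create_namespaced_custom_object(
    group="ray.io", version="v1", namespace="default",
    plural="rayclusters", body=ray_cluster,
)
```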
Speakers
Anne Holler

Chief Scientist, Elotl
Anne is Chief Scientist at Elotl. She is interested in resource efficiency. She worked on Uber's Michelangelo machine learning platform, on VeloCloud's SD-WAN management, on VMware's Distributed Resource Schedulers for servers and storage, on performance analysis for VMware, on Transmeta's...
Salt Palace | Level 1 | Grand Ballroom A

9:45am MST

Attack, Defense & Danger in the Age of AI - Shane Lawrence, Shopify
Tuesday November 12, 2024 9:45am - 10:10am MST
The transformers revolution has spurred a race between hackers looking for an edge and security teams looking for leverage, while organizations of all kinds rush to make use of this new technology with little awareness of how it works and how it could be used against them. In this talk, Shane will describe some of the ways that AI is being used by attackers, countermeasures for AI attacks, opportunities for AI to mitigate conventional attacks, and how AI-powered services might be used against their owners. He’ll show a live demo combining these concepts. Attendees will learn about the AI-related risks they face and techniques for managing those risks.
Speakers
Shane Lawrence

Senior Staff Infrastructure Security Engineer, Shopify
Shane is a Senior Staff Infrastructure Security Engineer at Shopify, where he's working on a multi-tenant platform that allows developers to securely build scalable apps and services for crafters, entrepreneurs, and businesses of all sizes.
Salt Palace | Level 1 | Grand Ballroom A

10:15am MST

Sponsored Keynote: Advancing Cloud Native AI Innovation Through Open Collaboration - Yuan Tang, Red Hat
Tuesday November 12, 2024 10:15am - 10:20am MST
In the rapidly evolving field of AI, innovation flourishes through the open exchange of ideas, resources, and knowledge. In this keynote, we will delve into Red Hat’s journey in cloud native and AI, showcasing our community-driven efforts and initiatives that promote a culture of open collaboration within the cloud native AI ecosystem. We invite you to join us in this collaborative effort and explore opportunities to contribute to and benefit from a vibrant community.
Speakers
Yuan Tang

Principal Software Engineer, Red Hat
Yuan is a principal software engineer at Red Hat, working on OpenShift AI. Previously, he led AI infrastructure and platform teams at various companies. He holds leadership positions in open source projects, including Argo, Kubeflow, and Kubernetes. He's also a maintainer and...
Salt Palace | Level 1 | Grand Ballroom A

10:25am MST

Sponsored Keynote: The Evolution of MLOps - Alex Yeh, GMI Cloud
Tuesday November 12, 2024 10:25am - 10:30am MST
The need to evolve from DevOps to MLOps arises from the unique challenges that machine learning (ML) systems bring, which traditional DevOps processes aren’t equipped to handle. While DevOps focuses on software development and operations, MLOps is necessary because ML models introduce complexities related to data, model lifecycle, and experimentation that go beyond typical software management. 
Speakers
Alex Yeh

CEO, GMI Cloud
CEO & Founder of GMI Cloud
Salt Palace | Level 1 | Grand Ballroom A

10:30am MST

AM Break 3
Tuesday November 12, 2024 10:30am - 10:40am MST

10:40am MST

Multitenancy and Fairness at Scale with Kueue: A Case Study - Aldo Culquicondor, Google & Rajat Phull, Apple
Tuesday November 12, 2024 10:40am - 11:05am MST
Developed by the Kubernetes community in collaboration with the ecosystem, Kueue augments K8s and ClusterAutoscaler to provide an E2E batch system. Kueue implements job queueing, deciding when jobs should wait and when they should start or be preempted, based on quotas and a hierarchy for sharing resources among teams. An exciting addition in the v0.7 release is fair sharing, designed to support large ML platforms serving multiple teams. Kueue allows platforms to model their teams and achieve high utilization of resources, while sharing cost and providing equitable access to unused resources. Teams can always reclaim their guaranteed quotas via preemption. The Kueue v0.7 and Kubernetes v1.31 releases also include performance optimizations to achieve high throughput. In this talk, you will learn about the challenges faced during the design and implementation of fair sharing and preemption, about this system running in production, and the plans to support complex hierarchies.
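As a rough illustration of the fair-sharing model described above, the following sketch creates a Kueue ClusterQueue in a cohort with a fair-sharing weight. Field names follow the kueue.x-k8s.io/v1beta1 API as of Kueue v0.7; verify against the current docs.

```python
# Sketch: a ClusterQueue in a cohort with a fair-sharing weight.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

cluster_queue = {
    "apiVersion": "kueue.x-k8s.io/v1beta1",
    "kind": "ClusterQueue",
    "metadata": {"name": "team-a"},
    "spec": {
        "cohort": "ml-platform",         # unused quota is shared within the cohort
        "fairSharing": {"weight": "2"},  # team-a gets a 2x share of borrowed capacity
        "namespaceSelector": {},
        "resourceGroups": [{
            "coveredResources": ["cpu", "nvidia.com/gpu"],
            "flavors": [{
                "name": "default-flavor",
                "resources": [
                    {"name": "cpu", "nominalQuota": "200"},
                    {"name": "nvidia.com/gpu", "nominalQuota": "16"},
                ],
            }],
        }],
    },
}

# ClusterQueue is cluster-scoped, hence the cluster-level create call.
api.create_cluster_custom_object(
    group="kueue.x-k8s.io", version="v1beta1",
    plural="clusterqueues", body=cluster_queue,
)
```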
Speakers
Aldo Culquicondor

Sr. Software Engineer, Google
Aldo is a Senior Software Engineer at Google. He works on Kubernetes and Google Kubernetes Engine, where he contributes to kube-scheduler, the Job API and other features to support batch, AI/ML and HPC workloads. He is currently a TL at SIG Scheduling and an active member of WG Batch...
Rajat Phull

Engineering Manager, Apple
Rajat Phull is an Engineering Manager at Apple. He works in the Machine Learning Platform team with a focus on GPU resource management and ML training orchestration at scale using Kubernetes.
Salt Palace | Level 1 | Grand Ballroom A

11:15am MST

LLM Powered Agents with Kubernetes - Hema Veeradhi & Shrey Anand, Red Hat
Tuesday November 12, 2024 11:15am - 11:40am MST
How would you build an LLM system to modify a Kubernetes deployment based on its live telemetry data stream? A vanilla LLM is not enough to solve this problem, as it is limited to outdated training data and is prone to hallucinations. In this talk, we will explore the concept of Agents—a powerful framework for solving complex multi-level tasks using an LLM as its reasoning engine, supported by a suite of tools. These tools can be advanced calculators, real-time web scrapers, domain knowledge extractors, etc. They include executable functions, RAG pipelines, APIs or other services that allow the agents to complete their tasks effectively. We will walk through a demo that leverages Kubernetes services and Podman containerization techniques to enable the agent workflow. Attendees will learn how a Kubernetes-based agent framework enhances the performance capabilities of LLMs, offering a scalable and autonomous solution for next-generation intelligent systems.
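The agent pattern the talk describes can be reduced to a small loop: the LLM either returns a final answer or names a tool to call. This schematic sketch (not the speakers' implementation; llm_complete and both tools are placeholders) shows the shape:

```python
# Schematic agent loop: an LLM "reasoning engine" repeatedly picks a tool
# until it can answer. llm_complete() is a placeholder for any chat call.
import json

def get_deployment_telemetry(name: str) -> str:
    """Stand-in tool: would query live metrics for a Deployment."""
    return json.dumps({"deployment": name, "p95_latency_ms": 950, "replicas": 2})

def scale_deployment(name: str, replicas: int) -> str:
    """Stand-in tool: would patch the Deployment via the Kubernetes API."""
    return f"scaled {name} to {replicas} replicas"

TOOLS = {"get_deployment_telemetry": get_deployment_telemetry,
         "scale_deployment": scale_deployment}

def run_agent(task: str, llm_complete, max_steps: int = 5) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        # The LLM replies either with {"final_answer": ...} or with a tool
        # call like {"tool": "scale_deployment", "args": {...}}.
        action = json.loads(llm_complete(history, tools=list(TOOLS)))
        if "final_answer" in action:
            return action["final_answer"]
        result = TOOLS[action["tool"]](**action["args"])
        history.append({"role": "tool", "content": result})
    return "step budget exhausted"
```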
Speakers
Shrey Anand

Data Scientist, Red Hat
Shrey Anand is a data scientist with over five years of experience in the field of AI/ML. He collaborates with the emerging technologies team at Red Hat, where he develops cutting-edge data science solutions to solve open source and business problems. As a strong advocate of open source...
Hema Veeradhi

Principal Data Scientist, Red Hat
Hema Veeradhi is a Principal Data Scientist on the Emerging Technologies team in the office of the CTO at Red Hat. Her work primarily focuses on implementing innovative open AI and machine learning solutions to help solve business and engineering problems. Hema is a staunch...
Salt Palace | Level 1 | Grand Ballroom A

11:50am MST

From Supercomputing to Serving: A Case Study Delivering Cloud Native Foundation Models - Autumn Moulder, Cohere
Tuesday November 12, 2024 11:50am - 12:15pm MST
Cloud native takes on new meaning in the AI and HPC domains. What does cloud native mean when your software is tightly coupled to hardware? When capacity is fixed, which assumptions start to break down? How can you flex GPUs between batch training workloads and inference? Join us for a case study demonstrating how a small team scaled ML infrastructure from a single cloud to multiple clusters across 4 cloud providers - in under 6 months. We’ll share unique multi-cloud challenges we uncovered around supercomputing infrastructure, cross-cloud networking, capacity & quota management, batch workloads, FinOps, and observability. We will particularly highlight our experience using Kueue to manage fixed capacity across clouds & where Kubernetes still falls short for HPC workloads. Leave with a solid understanding of what it takes for an infrastructure team to support the lifecycle of a cloud native foundation model.
Speakers
Autumn Moulder

Director of Infrastructure & Security, Cohere
Autumn is the Director of Infrastructure & Security at Cohere. She’s been with the company since September 2022, scaling teams & tools. Prior to buying into the startup life, she spent 3 years in financial services and 14 years at a large non-profit. Her passion is helping innovative...
Salt Palace | Level 1 | Grand Ballroom A

12:30pm MST

⚡ Lightning Talk: Charm++ on Kubernetes Cloud - Aditya Bhosale, University of Illinois at Urbana-Champaign
Tuesday November 12, 2024 12:30pm - 12:40pm MST
In this talk, we will detail the use of Kubernetes operators to run HPC applications using the Charm++ runtime system on Kubernetes clusters in the cloud. Charm++ is an adaptive, intelligent runtime system that provides capabilities such as dynamic load balancing, energy optimizations, and communication optimizations, in addition to support for resource elasticity. It is a well-established system in the HPC world and supports highly scalable applications such as NAMD for biomolecular simulations. We will talk about capabilities added to the setup, such as job malleability, which utilizes the shrink and expand feature of Charm++ jobs by changing the number of pods assigned to a job at run time. We will demonstrate the effectiveness of shrink/expand operations for different scheduling policies and quantify the associated overhead. Charm++ has recently added support for a Python-based framework, Charm4py, for writing HPC codes in Python. We will also talk about running Charm4py applications on Kubernetes.
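For context, a minimal Charm4py program looks like the following, based on the project's documented API; the Kubernetes operator discussed in the talk would manage the pods this runs on.

```python
# Illustrative Charm4py program: a Group places one Hello chare per
# processing element (PE); invoking a method on the group proxy broadcasts
# the call to every member.
from charm4py import charm, Chare, Group

class Hello(Chare):
    def greet(self):
        print(f"Hello from PE {charm.myPe()} of {charm.numPes()}")

def main(args):
    hello_group = Group(Hello)  # one chare per PE
    hello_group.greet()         # broadcast to all members
    charm.exit()

charm.start(main)
```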
Speakers
Aditya Bhosale

Graduate Student, University of Illinois at Urbana-Champaign
Salt Palace | Level 1 | Grand Ballroom A

12:40pm MST

Lunch Break 3
Tuesday November 12, 2024 12:40pm - 1:30pm MST
Food and Beverage stations located in Salt Palace Hall C, Level 1. Seating is available in the Grand Ballroom and Hall D.

Food and Beverage stations and seating located at the Hyatt Regency in Salt Lake Ballroom C, Level 2.

Pre-confirmed Halal or Kosher meals will be available in Hall D.
Salt Palace | Level 1 | Hall C + Hyatt | Level 2 | South Foyer

1:30pm MST

Dressing-up Your Cluster for AI in Minutes with a Portable Network CR - Sunyanan Choochotkaew & Tatsuhiro Chiba, IBM Research
Tuesday November 12, 2024 1:30pm - 1:55pm MST
Kubernetes network overhead and complexity are among the impediments to cloud adoption for AI, especially when using multiple networks to boost bandwidth for distributed tasks. Statically defining a network configuration for secondary interfaces is not a trivial task for platform engineers, who must meet the distinctive demands of heterogeneity and scale within a virtual-private-cloud cluster. In this talk, we show how deploying a single portable custom resource can play a significant role in transforming a VPC cluster into a supercomputer tailored for AI workloads. We share our journey on the Multi-NIC CNI project and demonstrate the benefit of seamlessly enabling dynamicity in network attachment definitions via practical use cases, along with outlining future directions for related open source projects like Multus, Node Resource Interface (NRI), Dynamic Resource Allocation (DRA), and Kubernetes Networking Interface (KNI).
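A sketch of what "a single portable custom resource" can look like, based on the Multi-NIC CNI project's published examples; the group, version, and field names here are quoted from memory and should be treated as assumptions to check against the project docs.

```python
# Sketch: the portable CR the talk refers to, per Multi-NIC CNI examples.
# Group/version/fields are assumptions; consult the project documentation.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

multi_nic_network = {
    "apiVersion": "multinic.fms.io/v1",
    "kind": "MultiNicNetwork",
    "metadata": {"name": "multi-nic-sample"},
    "spec": {
        "subnet": "192.168.0.0/16",
        "ipam": '{"type": "multi-nic-ipam", "hostBlock": 6, "interfaceBlock": 2}',
        "multiNICIPAM": True,
        "plugin": {
            "cniVersion": "0.3.0",
            "type": "ipvlan",     # backing CNI plugin for the secondary NICs
            "args": {"mode": "l3"},
        },
    },
}

api.create_cluster_custom_object(
    group="multinic.fms.io", version="v1",
    plural="multinicnetworks", body=multi_nic_network,
)
```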
Speakers
Tatsuhiro Chiba

Senior Technical Staff Member, IBM Research
Tatsuhiro Chiba is an STSM and Manager at IBM Research, specializing in performance optimization and acceleration of large-scale AI and HPC workloads on hybrid cloud. He is leading a project to enhance OpenShift performance and sustainability for AI and HPC by exploiting various cloud...
Sunyanan Choochotkaew

Staff Research Scientist, IBM
Sunyanan Choochotkaew works at IBM Research - Tokyo, specializing in cloud platform optimization. She actively contributes to various open-source projects, including Kepler, Multi-NIC CNI, and the CPE operator, where she holds the role of maintainer. She has also made contributions...
Salt Palace | Level 1 | Grand Ballroom A

2:05pm MST

Boosting Training and Inference Performance via Topology-Aware Scheduling of Heterogeneous Resources - He Cao, ByteDance
Tuesday November 12, 2024 2:05pm - 2:30pm MST
As LLMs rapidly evolve, K8s’ topology management cannot meet the performance demands in several aspects: 1. For new-generation high-density processors, NUMA affinity is insufficient to ensure inference performance. 2. The performance bottleneck has shifted from computation to networking. However, K8s does not consider the topology of heterogeneous resources like GPU and RDMA.

In this talk, He will introduce how ByteDance significantly improves LLM workload performance by enhancing topology-aware scheduling: 1. For nodes with high-density processors, achieving die-level affinity and implementing anti-affinity between memory bandwidth-intensive pods. 2. For pods within a training job, achieving inter-RDMA affinity at the ToR level to avoid switch congestion. 3. For inference workloads, achieving GPU-RDMA affinity at the PCIe switch level to enable GPUDirect RDMA for accelerated communication. 4. Achieving job-level topology affinity on top of the K8s scheduler, which operates at the pod level.
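As a point of reference, plain Kubernetes can express only a coarse version of this with pod affinity over a topology label, as in the sketch below; the label name is hypothetical, and the scheduler enhancements described in the talk go well beyond this mechanism.

```python
# Illustrative only: vanilla pod affinity keyed on a rack/ToR topology label.
# The label name is hypothetical; ByteDance's extensions are not shown here.
training_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "trainer-0", "labels": {"job": "llm-train"}},
    "spec": {
        "affinity": {"podAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [{
                # Pack pods of the same job under one ToR switch to avoid
                # inter-switch congestion on RDMA traffic.
                "labelSelector": {"matchLabels": {"job": "llm-train"}},
                "topologyKey": "network.example.com/tor",  # hypothetical label
            }],
        }},
        "containers": [{
            "name": "trainer",
            "image": "example/llm-train:latest",  # placeholder image
            "resources": {"limits": {
                "nvidia.com/gpu": "8",
                "rdma/hca": "1",  # RDMA device-plugin resource names vary
            }},
        }],
    },
}
```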
Speakers
He Cao

Senior Software Engineer, ByteDance
He Cao is a senior software engineer on the Cloud Native team at ByteDance, a maintainer of Katalyst and KubeZoo, and a member of Istio. He has 5+ years of experience in the cloud native area. Since joining ByteDance, he has designed and implemented several critical systems for VKE...
Salt Palace | Level 1 | Grand Ballroom A

2:40pm MST

Brag Your RAG with the MLOPS Swag - Madhav Sathe, Google & Jitender Kumar, Publicis Sapient
Tuesday November 12, 2024 2:40pm - 3:05pm MST
Organizations are beginning to unlock significant value by integrating Large Language Models (LLMs) & Retrieval-Augmented Generation (RAG) into their business-critical processes. However, enterprises often face challenges in meeting the high expectations of GenAI-driven business outcomes. Bridging this gap requires meticulous planning in governance, continuous evaluation, seamless scaling, operational costs, and time-to-market. In this session, attendees will witness a live demonstration of a RAG application stack built with LangChain, Canopy, and a PostgreSQL Vector database, all deployed on Kubernetes. Additionally, we will discuss leveraging GPU and TPU accelerators to enhance computational efficiency. The audience will also gain insights into MLOps strategies for data splitting, embeddings, retrieval, and prompt engineering. Join us to explore how to effectively leverage MLOps with Kubernetes to achieve scalable and impactful GenAI solutions.
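To make the retrieval step concrete, here is a minimal, framework-agnostic sketch of RAG over a PostgreSQL vector store. This is not the presenters' exact stack; embed(), the table name, and the connection string are placeholders.

```python
# Minimal RAG retrieval sketch: embed the question, pull nearest chunks from
# a PostgreSQL+pgvector table, and build an augmented prompt.
import psycopg2

def embed(text: str) -> list[float]:
    raise NotImplementedError("call your embedding model here")  # placeholder

def retrieve(question: str, k: int = 4) -> list[str]:
    conn = psycopg2.connect("dbname=rag user=rag")  # placeholder DSN
    with conn, conn.cursor() as cur:
        # pgvector's <-> operator orders rows by vector distance.
        cur.execute(
            "SELECT chunk FROM documents ORDER BY embedding <-> %s::vector LIMIT %s",
            (str(embed(question)), k),
        )
        return [row[0] for row in cur.fetchall()]

def build_prompt(question: str) -> str:
    context = "\n---\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```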
Speakers
Jitender Kumar

Director Technology - DevOps and Cloud, Publicis Sapient
20+ years of successful IT and delivery management experience leading mission-critical infrastructure, software development and implementation projects involving strategic business and technology change and providing measurable financial results for the organization. Worked with Financial...
Madhav Sathe

Principal Architect, Google
Madhav helps major enterprises drive innovation using modern application architectures, containers and DevOps. Madhav has been a speaker at conferences such as SpringOne, Cloud Foundry Summit and Oracle OpenWorld. He has co-authored a white paper on container security. Madhav currently...
Salt Palace | Level 1 | Grand Ballroom A

3:05pm MST

PM Break 1
Tuesday November 12, 2024 3:05pm - 3:20pm MST

3:20pm MST

Reproducible AI with Kubeflow, lakeFS and LangChain - Oz Katz, Treeverse
Tuesday November 12, 2024 3:20pm - 3:45pm MST
LangChain has become one of the most popular frameworks for building custom, generative AI-driven apps powered by LLMs that leverage RAG (Retrieval-Augmented Generation) for enhanced results. But like all data products, these applications are only as good as the organizational data fed into them, and we’ve all learned the hard way that the data is often far from perfect. In this hands-on tutorial you’ll learn how to build a reproducible AI application pipeline with Kubeflow, LangChain and lakeFS, widely adopted OSS tools in the ML & GenAI stack. By building a RAG chatbot and iteratively tuning it for best results using lakeFS’s temporal versions, you’ll come away with improved methods for data reproducibility in your custom AI apps, providing better data quality alongside an improved user experience for your application users.
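The reproducibility idea rests on reading data through refs: a branch name follows the latest commit, while a commit ID pins an immutable snapshot. A minimal sketch, assuming lakeFS's S3-compatible gateway and placeholder endpoint, credentials, and paths:

```python
# Sketch: the same object path read at a branch ref (moving) versus a commit
# ref (immutable) pins exactly which data version fed the RAG index.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://lakefs.example.com",  # placeholder lakeFS endpoint
    aws_access_key_id="AKIA...",                # placeholder credentials
    aws_secret_access_key="...",
)

# Branch ref: follows the latest commit on "main".
live = s3.get_object(Bucket="corpus-repo", Key="main/docs/handbook.md")

# Commit ref: immutable snapshot; rerunning the pipeline reproduces results.
commit_id = "f3c2a9e0"  # placeholder commit ID
pinned = s3.get_object(Bucket="corpus-repo", Key=f"{commit_id}/docs/handbook.md")
```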
Speakers
Oz Katz

Co-Founder, CTO, lakeFS
Oz Katz is the CTO and co-creator of the open source lakeFS project, a platform that delivers resilience and manageability to object-storage-based data lakes. Oz engineered and maintained petabyte-scale data infrastructure at analytics giant SimilarWeb, which he joined...
Salt Palace | Level 1 | Grand Ballroom A

3:55pm MST

Inference on Streaming Data at Scale at Intuit - Sri Harsha Yayi & Vigith Maurice, Intuit
Tuesday November 12, 2024 3:55pm - 4:20pm MST
At Intuit, ML teams faced challenges with processing and running inference on high-throughput streaming data. Connecting to various messaging systems like Kafka, Pulsar, and SQS proved to be a time-consuming and intricate process. Moreover, our ML teams required the ability to perform intermediate processing and execute inference as part of their workflows. To complicate matters further, scaling the processing and inference with the volume of events introduced additional challenges. To address these challenges, we created Numaflow, a K8s-native open-source platform for scalable event processing. It simplifies connecting to event sources, enables teams to do event processing and inference on streaming data without a steep learning curve, and integrates seamlessly with existing systems. This talk is for ML engineers, data scientists, and anyone interested in asynchronous inference on streaming data. We'll show how Numaflow overcomes these obstacles and streamlines inference on streaming data.
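For a sense of the programming model, a Numaflow map-style UDF in Python has roughly the following shape. Import paths follow the pynumaflow SDK's published examples; treat the exact names and signatures as assumptions to verify against the SDK docs.

```python
# Shape of a Numaflow "map" UDF: Numaflow handles the Kafka/Pulsar/SQS
# sources and the scaling; the UDF only sees individual events.
from pynumaflow.mapper import Datum, Message, Messages, MapServer

def run_model(payload: bytes) -> float:
    return 0.0  # stand-in for the actual inference call

def infer(keys: list[str], datum: Datum) -> Messages:
    features = datum.value              # raw event bytes from the source
    score = run_model(features)
    msgs = Messages()
    msgs.append(Message(str(score).encode(), keys=keys))
    return msgs

if __name__ == "__main__":
    # Numaflow connects to this gRPC server from the vertex sidecar.
    MapServer(infer).start()
```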
Speakers
Vigith Maurice

Principal Engineer, Intuit
Vigith is a co-creator of Numaproj and a Principal Software Engineer on the Intuit Core Platform team in Mountain View, California. One of Vigith's current day-to-day focus areas is the various challenges of building scalable data and AIOps solutions for both batch and high-throughput...
Sri Harsha Yayi

Product Manager, Intuit
Sri Harsha Yayi is a Product Manager at Intuit, where he primarily focuses on the company's Modern SaaS Kubernetes platform, specifically within the event-driven systems domain. He is the PM for Numaflow, an open-source, Kubernetes native platform designed for the development of event-driven...
Salt Palace | Level 1 | Grand Ballroom A

4:30pm MST

Incremental GPU Slicing in Action - Abhishek Malvankar & Olivier Tardieu, IBM Research
Tuesday November 12, 2024 4:30pm - 4:55pm MST
Large language models are often released as families of models with varying parameter counts and quantization. To reduce cost, inference services increasingly rely on dynamic model selection, preferring smaller models when possible. GPU vendors are on a journey to enable dynamic GPU slicing, making it possible for a workload to request a fraction of the compute and memory units in a GPU, and for the slices to be created and destroyed on demand without disrupting existing workloads. The onus is now on Kubernetes. The Device Management Working Group is hard at work to expose these capabilities. While vendor-agnostic slicing APIs do not exist yet, this talk demonstrates that incremental GPU slicing is possible today. We replace the Multi-Instance GPU manager, which only permits partitioning GPUs in bulk, with an open-source incremental-slicing controller without needing new APIs or changes to the device plugin. Come learn how to achieve incremental slicing in your GPU clusters.
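For comparison, here is how workloads consume MIG slices today with NVIDIA's device plugin in the "mixed" strategy, where each MIG profile is exposed as a distinct extended resource; the talk's controller changes how slices are created and destroyed, not how pods request them.

```python
# A pod asking for one 1g.5gb MIG slice. With the NVIDIA device plugin in
# "mixed" strategy, each profile appears as its own resource name.
inference_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "small-model-server"},
    "spec": {"containers": [{
        "name": "server",
        "image": "example/llm-server:latest",  # placeholder image
        "resources": {"limits": {"nvidia.com/mig-1g.5gb": "1"}},
    }]},
}
```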
Speakers
Abhishek Malvankar

Senior Software Engineer, IBM Research
Abhishek is a Senior Software Engineer and Master Inventor at IBM Research and co-chairs the CNCF Batch System Initiative. He focuses on resource management, performance, and distributed computing for AI workloads in the cloud. Abhishek enjoys designing easy-to-use solutions for the cloud...
Olivier Tardieu

Principal Research Scientist, Manager, IBM
Dr. Olivier Tardieu is a Principal Research Scientist and Manager at IBM T.J. Watson, NY, USA. He joined IBM Research in 2007. His current research focuses on cloud-related technologies, including Serverless Computing and Kubernetes, as well as their application to Machine Learning...
Salt Palace | Level 1 | Grand Ballroom A

5:00pm MST

⚡ Lightning Talk: Transform Your Kubernetes Cluster Into a GenAI Platform: Get Ready-to-Use LLM APIs Today! - Kenji Kaneda, CloudNatix Inc.
Tuesday November 12, 2024 5:00pm - 5:10pm MST
Are you eager to fine-tune LLMs and run inference directly within your Kubernetes clusters? Do you want an API compatible with OpenAI so you can leverage the extensive GenAI ecosystem? If so, LLMariner (https://llmariner.ai) is what you need. It instantly builds a software stack that provides an OpenAI-compatible API for inference, fine-tuning, and model management. In this talk, we'll provide an overview of LLMariner and showcase its capabilities through practical use cases. Join us to learn how you can leverage LLMariner to enhance Kubernetes for your Generative AI workflows.
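Because the API is OpenAI-compatible, the stock openai Python client should work once pointed at the in-cluster endpoint; the URL and model name below are placeholders, not LLMariner defaults.

```python
# Talk to an OpenAI-compatible endpoint with the standard openai client.
from openai import OpenAI

client = OpenAI(
    base_url="http://llmariner.llmariner.svc.cluster.local/v1",  # placeholder URL
    api_key="dummy-key",  # placeholder; auth depends on the deployment
)

resp = client.chat.completions.create(
    model="google-gemma-2b-it",  # placeholder model name
    messages=[{"role": "user", "content": "What is Kubernetes?"}],
)
print(resp.choices[0].message.content)
```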
Speakers
Kenji Kaneda

Chief Architect, CloudNatix Inc.
Kenji is a chief architect at CloudNatix and has been working on large-scale distributed systems - especially cluster management systems - for over ten years. Most recently, he was a Principal Engineer at NVIDIA, responsible for developing their deep learning training platform and...
Salt Palace | Level 1 | Grand Ballroom A

5:15pm MST

⚡ Lightning Talk: Cost Saving Strategies for Interactive AI Development - Shravan Achar, Apple
Tuesday November 12, 2024 5:15pm - 5:25pm MST
The interactive nature of Jupyter notebooks has made them indispensable tools for data scientists and AI researchers, facilitating exploratory data analysis, prototyping, and model development. However, managing the cost of resource-intensive computations at different stages of AI/ML lifecycle presents significant challenges. We leveraged Apache YuniKorn to design a resource management system tailored for notebook workloads, which incorporates fair sharing, user-specific policies and budget constraints to allocate computational resources efficiently while adapting for both data preparation and model training stages. And thanks to the extensibility of JupyterLab, we offer rich displays next to the Notebook enabling data scientists to introspect resource usage in real time. This session presents cost saving strategies for interactive development on Jupyter using Kubeflow for model training and Spark for data preparation with YuniKorn scheduler.
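A sketch of how a notebook pod can be routed through YuniKorn so queue quotas and fair sharing apply; the label conventions follow YuniKorn's documentation but should be verified against your deployment.

```python
# Route a notebook pod through YuniKorn: set schedulerName plus queue labels
# so fair-sharing and budget policies apply to interactive workloads.
notebook_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "jupyter-alice",
        "labels": {
            "applicationId": "notebook-alice-001",        # per-user app grouping
            "queue": "root.data-science.interactive",     # placeholder queue path
        },
    },
    "spec": {
        "schedulerName": "yunikorn",
        "containers": [{
            "name": "notebook",
            "image": "jupyter/scipy-notebook:latest",
            "resources": {"requests": {"cpu": "2", "memory": "8Gi"},
                          "limits": {"cpu": "4", "memory": "16Gi"}},
        }],
    },
}
```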
Speakers
Shravan Achar

Sr. Software Engineer, Apple
Shravan is a senior software engineer at Apple with a passion for open source technologies. With a background in Mathematics and Computer Science, their current interests include MLOps, Scheduling in AI and Jupyter Notebooks.
Salt Palace | Level 1 | Grand Ballroom A

5:25pm MST

Cloud Native + Kubernetes AI Day | Closing Remarks - Amber Graner, Rajas Kakodkar, Ricardo Rocha, Program Co-Chairs
Tuesday November 12, 2024 5:25pm - 5:30pm MST
Speakers
Ricardo Rocha

Lead Platforms Infrastructure, CERN
Ricardo leads the Platform Infrastructure team at CERN with a strong focus on cloud native deployments and machine learning. For several years he has led the internal effort to transition services and workloads to cloud native technologies, as well as dissemination and training...
Amber Graner

Product Owner
I’m a seasoned professional with a rich history in open source communities – Ubuntu, Linaro, Open Compute Project Foundation, Zeek, Kubeflow and more. I’m known for my leadership skills and commitment to inclusivity. I served as an all-source intelligence analyst in the US Army...
Rajas Kakodkar

Staff Software Engineer at Broadcom | Tech Lead CNCF TAG Runtime, Broadcom
Rajas is a staff software engineer at Broadcom and a tech lead of the CNCF Technical Advisory Group, Runtime. He is actively involved in the AI working group in the CNCF. He is a Kubernetes contributor and has been a maintainer of the Kube Proxy Next Gen Project. He has also served...
Salt Palace | Level 1 | Grand Ballroom A

5:30pm MST

Evening Reception (Salt Palace)
Tuesday November 12, 2024 5:30pm - 7:00pm MST
Join us onsite for drinks and appetizers with fellow co-located attendees from Tuesday's CNCF-hosted Co-located Events.

Attendees from all CNCF Co-located Events are welcome.
Salt Palace | Level 1 | Lower Concourse + Level 2 | Upper Mezzanine
 
