Open Source Summit North America 2025
June 23 - 25, 2025
Denver, Colorado
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for Open Source Summit North America 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Mountain Daylight Time (UTC/GMT -6). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date."

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

Venue: Bluebird Ballroom 3E
Wednesday, June 25
 

11:00am MDT

Guarding the LLM Galaxy: Security, Privacy, and Guardrails in the AI Era - Jigyasa Grover, BORDO AI & Rishabh Misra, Attentive Mobile Inc
Wednesday June 25, 2025 11:00am - 11:40am MDT
The widespread adoption of Large Language Models (LLMs) like GPT-4, Claude, and Gemini has introduced unprecedented capabilities and equally unprecedented risks. Organizations are increasingly deploying LLMs to handle sensitive tasks, from processing medical records to analyzing financial documents. This talk examines the evolving landscape of LLM security and privacy, combining theoretical foundations with a walkthrough of example implementations.

Through real-world case studies of both attacks and defenses and practical implementation guidance using popular security tools, we'll explore critical vulnerabilities and proven defensive techniques. Special attention will be given to securing fine-tuned and domain-specific LLMs, with live examples using NVIDIA’s NeMo Guardrails, LangChain's security tools, and Microsoft's guidance library.
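
To make the walkthrough concrete, here is a minimal, hypothetical sketch of the kind of NeMo Guardrails wiring the talk covers; the config directory and its contents are assumptions, not material from the session.

```python
# Minimal NeMo Guardrails sketch (hypothetical; config contents are assumptions).
# Expects a ./config directory with a config.yml defining the model and rails
# (e.g., input rails for jailbreak detection, output rails for PII filtering).
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# The rails intercept both the request and the response around the LLM call.
response = rails.generate(messages=[
    {"role": "user", "content": "Summarize this patient record for billing."}
])
print(response["content"])
```

LangChain's security tooling and Microsoft's guidance library slot into the same place in the pipeline: between untrusted user input and the model's raw output.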
Speakers
Jigyasa Grover

Lead, AI & Research, BORDO AI
A 10-time award winner in Artificial Intelligence and Open Source and co-author of the book 'Sculpting Data For ML', Jigyasa Grover is a powerhouse brimming with passion to make a dent in the world of technology and bridge its gaps. As AI & Research Lead, she has years of ML engineering…
Rishabh Misra

Lead Machine Learning Engineer, Attentive Mobile Inc
Author of the book "Sculpting Data for ML", I am a Lead ML Engineer & Researcher recognized by the US Government for outstanding contributions to ML research. I have extensively published and reviewed research at top AI conferences in NLP (LLMs / GenAI), Deep Learning, and Applied…
Bluebird Ballroom 3E
  Open AI + Data

11:55am MDT

Fast Inference, Furious Scaling: Leveraging vLLM With KServe - Rafael Vasquez, IBM
Wednesday June 25, 2025 11:55am - 12:35pm MDT
In this talk, we will introduce two open source projects, vLLM and KServe, and explain how they can be integrated to deliver better performance and scalability for LLMs in production. The session will include a demo showcasing their integration.

vLLM is a high-performance library specifically designed for LLM inference and serving, offering cutting-edge throughput and efficiency through techniques such as PagedAttention, continuous batching, and optimized CUDA kernels, making it ideal for production environments that demand fast, large-scale LLM serving.

KServe is a Kubernetes-based platform designed for scalable model deployment. It provides robust features for managing AI models in production, including autoscaling, monitoring, and model versioning.

By combining vLLM's inference optimizations with KServe's scalability, organizations can deploy LLMs effectively in production environments, ensuring fast, low-latency inference and seamless scaling across cloud platforms.
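
For orientation before the demo, below is a minimal sketch of vLLM's offline inference API, which is what KServe wraps and scales when the two are integrated; the model name is an illustrative assumption, and the KServe side is configured declaratively (an InferenceService resource) rather than through this Python API.

```python
# Minimal vLLM offline-inference sketch (model name is an assumption).
from vllm import LLM, SamplingParams

# PagedAttention, continuous batching, and the optimized CUDA kernels
# all live inside the engine; callers just submit prompts.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["What does KServe add on top of vLLM?"], params)
for out in outputs:
    print(out.outputs[0].text)
```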
Speakers
Rafael Vasquez

Open Source Software Developer, IBM
Rafael Vasquez is a software developer on the Open Technology team at IBM. He previously completed an MASc working on self-driving car research and transitioned from a data scientist role in the retail field to his current role, where he continues to grow his passion for MLOps and…
Bluebird Ballroom 3E
  Open AI + Data

2:10pm MDT

Building Your (Local) LLM Second Brain - Olivia Buzek, IBM
Wednesday June 25, 2025 2:10pm - 2:50pm MDT
LLMs are hotter than ever, but most LLM-based solutions available to us require you to use models trained on data with unknown provenance, send your most important data off to corporate-controlled servers, and use prodigious amounts of energy every time you write an email.

What if you could design a “second brain” assistant with OSS technologies, that lives on your laptop?

We’ll walk through the OSS landscape, discussing the nuts and bolts of combining Ollama, LangChain, OpenWebUI, Autogen and Granite models to build a fully local LLM assistant. We’ll also discuss some of the particular complexities involved when your solution involves a local quantized model vs one that’s cloud-hosted.

In this talk, we'll build on the lightning talk to cover complexities like the following (a minimal sketch of the local stack appears after this list):
* how much latency are you dealing with when running on a laptop?
* how much effectiveness do you lose to the quality degradation of a 7-8B parameter model?
* how do reasoning and multimodal abilities help the assistant task?
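
As a flavor of the stack discussed above, here is a minimal sketch of a fully local chat loop using the ollama Python client; the Granite model tag is an assumption and varies by Ollama release.

```python
# Minimal local "second brain" sketch via the ollama Python client.
# Assumes the Ollama daemon is running and a Granite model has been pulled,
# e.g. `ollama pull granite3.1-dense:8b` (the tag is an assumption).
import ollama

history = [{"role": "system", "content": "You are my local second-brain assistant."}]

def ask(question: str) -> str:
    history.append({"role": "user", "content": question})
    # Inference runs entirely on the laptop; no data leaves the machine.
    reply = ollama.chat(model="granite3.1-dense:8b", messages=history)
    answer = reply["message"]["content"]
    history.append({"role": "assistant", "content": answer})
    return answer

print(ask("Summarize yesterday's meeting notes."))
```

Latency here is dominated by the quantized model's token rate on local hardware, which is exactly the trade-off the talk examines.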
Speakers
Olivia Buzek

STSM watsonx.ai - IBM Research, IBM
Olivia has been building machine learning and natural language processing models since before it was cool. She's spent several years at IBM working on opening up Watson tech, around the country and around the world.
Bluebird Ballroom 3E
  Open AI + Data

3:05pm MDT

AI Pipelines With OPEA: Best Practices for Cloud Native ML Operations - Ezequiel Lanza, Intel & Melissa McKay, JFrog
Wednesday June 25, 2025 3:05pm - 3:45pm MDT
The Open Platform for Enterprise AI (OPEA) is an open source project intended to help organizations with the realities of enterprise-grade GenAI deployments. Building from scratch is a costly endeavor, and the ability to quickly iterate on a solution and determine its viability for your organization is essential to ensure you are making the best moves forward.

During this session, Ezequiel and Melissa will introduce you to the OPEA platform and how to empower your team to build, deploy, and manage AI pipelines more effectively. Attendees will gain insights into best practices for handling complex AI/ML workloads, automating dependency management, and integrating Kubernetes for efficient resource utilization. With a focus on real-world applications, this talk not only showcases the transformative potential of these tools but also encourages attendees to explore new ways to contribute, innovate, and collaborate in driving the future of AI adoption in enterprise environments.
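
To ground the pipeline idea, here is a hedged sketch of querying an already-deployed OPEA ChatQnA megaservice over HTTP; the host, port, and payload shape follow the project's published examples but should be treated as assumptions that vary by deployment.

```python
# Hypothetical client call to a deployed OPEA ChatQnA megaservice.
# Endpoint path and payload shape mirror OPEA's GenAIExamples;
# the host and port are assumptions about a particular deployment.
import requests

resp = requests.post(
    "http://localhost:8888/v1/chatqna",  # deployment-specific assumption
    json={"messages": "What does OPEA provide for enterprise GenAI?"},
    timeout=120,
)
resp.raise_for_status()
print(resp.text)  # plain or streamed completion, depending on configuration
```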
Speakers
Ezequiel Lanza

Open Source AI Evangelist, Intel
Passionate about helping people discover the exciting world of artificial intelligence, Ezequiel is a frequent AI conference presenter and the creator of use cases, tutorials, and guides that help developers adopt open source AI tools.
Melissa McKay

Head of Developer Relations, JFrog
Melissa is passionate about Java, DevOps, and Continuous Delivery. She is currently Head of Developer Relations for JFrog and a member of the Technical Steering Committee of the Open Platform for Enterprise AI (OPEA). Melissa has been recognized as a Java Champion and a Docker Captain…
Bluebird Ballroom 3E
  Open AI + Data

4:20pm MDT

Scalable and Efficient LLM Serving With the vLLM Production Stack - Junchen Jiang, University of Chicago & Yue Zhu, IBM Research
Wednesday June 25, 2025 4:20pm - 5:00pm MDT
Large Language Models (LLMs) are reshaping how we build applications; however, efficiently serving them at scale remains a major challenge.

The vLLM serving engine, historically focused on single-node deployments, is now being extended into a full-stack inference system through our open source project, vLLM Production Stack. This extension enables any organization to deploy vLLM at scale with high reliability, high throughput, and low latency.
Code: https://github.com/vllm-project/production-stack

At a high level, the vLLM Production Stack project lets users deploy vLLM to their Kubernetes cluster with a single command. Its optimizations include KV cache sharing to speed up inference (https://github.com/LMCache/LMCache), prefix-aware routing that directs inference queries to the vLLM instances holding the corresponding KV caches, and robust observability features for monitoring engine status and autoscaling.

Attendees will discover best practices and see real-time demonstrations of how these optimizations work together to enhance LLM inference performance.
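
Because the stack fronts its vLLM replicas with an OpenAI-compatible router, a deployed cluster can be exercised with any OpenAI client; the sketch below is a hypothetical example in which the base URL and model name are assumptions about a particular deployment.

```python
# Querying a vLLM Production Stack router via its OpenAI-compatible API.
# The router applies prefix-aware routing, steering each request toward the
# vLLM replica already holding a matching KV cache.
from openai import OpenAI

# Base URL and model name are deployment-specific assumptions.
client = OpenAI(base_url="http://localhost:30080/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Why does KV-cache sharing help?"}],
)
print(completion.choices[0].message.content)
```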
Speakers
Junchen Jiang

Assistant Professor, University of Chicago
Junchen Jiang is an Assistant Professor of CS at the University of Chicago. His research pioneers new approaches to LLM inference systems (https://github.com/vllm-project/production-stack and https://github.com/LMCache/LMCache). He received his Ph.D. from CMU in 2017 and his bachelor's…
Yue Zhu

Staff Research Scientist, IBM Research
Yue Zhu is a Staff Research Scientist specializing in foundation model systems and distributed storage systems. Yue obtained a Ph.D. in Computer Science from Florida State University in 2021 and has consistently contributed to sustainability for foundation models and scalable and efficient…
Bluebird Ballroom 3E
  Open AI + Data
 