I went in focused on learning more about Ray’s real-world scaling patterns, and left with a clearer picture of how we can apply it to large-scale data + AI workloads.
Met some great folks across open source, enterprise teams, and early-stage startups; a few takeaways from those conversations —
Ray is increasingly being used across the full ML/RL pipeline (not just training) for data processing, logging, and orchestration.
The community sees Ray not just as a compute framework, but as part of the broader production-AI infrastructure (open source + ecosystem) strategy.
Practical deployment and observability are winning over hype; Ray is solving real-world problems (scaling, reliability, cost) rather than serving purely as a research abstraction.
A few sessions stood out for delivering actionable insights —
Ray Agent Engine: Agent Deployment using Ray Serve (Apple) — a deep dive into how they’re using Ray Serve to deploy AI agents with autoscaling, traffic shaping, and framework-agnostic execution. Strong real-world architecture takeaways.
Breaking the Dataset Iteration Bottleneck: Real-Time ML Experimentation with Ray (Pinterest) — great session on accelerating the train/iterate/validate loop by using Ray to remove traditional dataset I/O bottlenecks. Useful for anyone working on high-frequency experimentation or continuous model improvement pipelines.
Revolutionizing Model Serving with a 50× Cost Reduction using Ray Serve (Workday) — compelling case study on serving tens of thousands of models with Ray Serve, with major cost and operational wins. Clear lessons on scaling, optimization, and infrastructure ownership.
The best part of Ray Summit wasn’t the tech alone — it was seeing how people are applying it in production today.