Apache Pinot is a distributed, real-time OLAP (Online Analytical Processing) store built and open-sourced by LinkedIn. At LinkedIn, Pinot powers a wide range of real-time analytics use cases, supporting both site-facing and internal decision-making platforms. It provides sub-second query latencies (P99 < 1 second for many use cases) and processes approximately 1 trillion queries per month. LinkedIn’s production Pinot fleet spans ~14,000 hosts and manages over 600 tables and 4 petabytes of data across multiple data centers, supporting both stateful and stateless services.
LinkedIn recently migrated its production Apache Pinot fleet from on-premises bare-metal hardware to Kubernetes with zero downtime. This tech talk will explore the technical journey, focusing on the design choices, the challenges and trade-offs faced, and the balance between building custom tools and leveraging existing solutions.
Key topics include:
- Overview of the distributed Pinot OLAP database
- Kubernetes migration requirements and design considerations
- Strategy for availability zone-aware shard placement
- Containerization challenges (e.g., hardware SKU management, OS kernel configurations, and handling stateful vs. stateless components)
- Implementation of a no-downtime migration orchestrator
- Evaluation of in-place vs. out-of-place migration options
- Use of Airflow and Temporal for automated OLAP table migrations and performance testing
This session is aimed at organizations managing stateful database systems. It shares lessons learned and strategies for completing such migrations with minimal disruption to internal and external users.



