In a busy Kubernetes cluster, the API server is constantly bombarded with requests. Without proper request management, a massive traffic spike or one buggy client can overwhelm the "brain" of your cluster, potentially causing a total outage.
This talk introduces API Priority and Fairness (APF), the built-in system that keeps your cluster stable under pressure. We will break down how APF replaces "all-or-nothing" limits with a smarter approach:
- Priority Levels: How critical calls, such as leader election and lease renewals, get a "fast lane" while other requests are queued.
- Noisy Neighbors: How APF uses shuffle sharding to keep one buggy client from crowding out everyone else.
- Fairness: How the server decides who gets to go next when resources are tight.
We’ll walk through what APF solves and how concurrency and queuing work under the hood. By the end, you’ll understand why APF matters and how it protects the API server.



