Enhancement Description
Today, any user/controller that wants a set of EndpointSlices for a set of pods matching a label selector has two options:
- Create a Service
- Create and update the EndpointSlice manually as pods matching that selector spin up and down
The latter tends to be avoided by controllers when possible, since updating endpoints that rapidly across a large cluster has a real impact on scalability. The Kubernetes control plane is already capable of doing this for Service resources, so it stands to reason that this functionality should become available to arbitrary consumers as well. In practice, many controllers end up settling for option 1, creating a Service whenever they need EndpointSlices. However, this is not always desirable because Service comes with a bunch of other stuff that you may not want (e.g. a DNS hostname, a frontend IP, etc.).
One recent example of this complexity is the InferencePool resource in the Gateway API inference extension (GAIE). That resource needs to describe a pool of inference endpoints in the cluster; however, using Service to do so would inevitably result in users calling the InferencePool using its hostname/cluster VIP, resulting in kube-proxy load balancing that negates all of the performance benefits of the endpoint picker architecture. And because only the aforementioned two options for endpoint generation are available to them, GAIE implementations end up bearing the complexity of making custom load balancing work.
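To make option 2 concrete, here is a minimal Python sketch of the bookkeeping a controller takes on when it maintains endpoints itself. The `Pod` class and the `matches`, `desired_endpoints`, and `reconcile` helpers are hypothetical names used only for illustration, not Kubernetes APIs, and the sketch deliberately ignores readiness, ports, and slice size limits.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical, heavily simplified stand-in for a Kubernetes Pod."""
    name: str
    ip: str
    labels: dict = field(default_factory=dict)


def matches(selector: dict, labels: dict) -> bool:
    """Equality-based selector: every selector key/value must appear in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def desired_endpoints(selector: dict, pods: list) -> list:
    """Pod IPs that a manually managed endpoint list would need to contain."""
    return sorted(pod.ip for pod in pods if matches(selector, pod.labels))


def reconcile(current: list, selector: dict, pods: list) -> tuple:
    """Return (desired, changed). A real controller would issue a write
    to the API server whenever `changed` is True (assumption: not shown)."""
    desired = desired_endpoints(selector, pods)
    return desired, desired != current


pods = [
    Pod("llm-0", "10.0.0.1", {"app": "llm"}),
    Pod("web-0", "10.0.0.2", {"app": "web"}),
]
endpoints, changed = reconcile([], {"app": "llm"}, pods)

# A new matching pod appears, so the endpoint list must be rewritten again.
pods.append(Pod("llm-1", "10.0.0.3", {"app": "llm"}))
endpoints_after, changed_after = reconcile(endpoints, {"app": "llm"}, pods)
```

Every pod that starts or stops under the selector forces another pass like this plus a write, which is the churn that makes the manual option unattractive at scale.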
This KEP is meant to serve as the root of the parallel Gateway API GEP describing this functionality. The idea is for that GEP to stay permanently experimental so that implementations (and ecosystem subprojects like Kube-Agentic-Networking and WG AI Gateway) can prototype and get feedback before this KEP becomes GA.
This would also give us the opportunity to address some longstanding feature requests like kubernetes/kubernetes#62795 and kubernetes/kubernetes#48528.
/ccing some interested folks: @mikemorris @robscott @LiorLieberman @bowei @aojea @danwinship @youngnick @kflynn @rikatz
ServiceresourceService#6127k/enhancements) update PR(s):k/k) update PR(s):k/website) update PR(s):