Add observed status fields to the CQ proposal by norbertcyran · Pull Request #9759 · kubernetes/autoscaler

norbertcyran · 2026-06-05T16:37:58Z

What type of PR is this?

/kind documentation

What this PR does / why we need it:

Added observed* status fields to CapacityQuota proposal.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

k8s-ci-robot · 2026-06-05T16:38:06Z

This issue is currently awaiting triage.

If SIG Autoscaling contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

norbertcyran · 2026-06-05T16:39:33Z

@DerekFrank @jmdeal could you please take a look if this API change makes sense from the Karpenter perspective? Thanks!

norbertcyran · 2026-06-05T17:01:43Z

/assign x13n

x13n · 2026-06-08T07:08:56Z

Can you clarify the use case here? Why wouldn't invalid selectors be validated on admission? Keeping last valid spec as a part of status seems a bit non-standard, I'd like to understand why we need this.

norbertcyran · 2026-06-08T14:12:28Z

@x13n thanks for taking a look!

Can you clarify the use case here?

Other than basic label selector validation, cloud providers might configure some custom validation rules, for example blocklisting certain labels. In some managed k8s services, cloud providers offer pricing per resources requested by pod. One example being GKE Autopilot, another one being EKS Fargate (please correct me if I'm wrong). Using a CapacityQuota with a certain label selector could force the autoscaler to provision a machine that is larger than required for unschedulable pods

Why wouldn't invalid selectors be validated on admission?

They could. Though adding a validation layer on the controller side offers stronger reliability: even if webhook configuration has been removed, controller will still mark an invalid quota as such and will prevent it from being enforced.

Keeping last valid spec as a part of status seems a bit non-standard, I'd like to understand why we need this.

It's also nothing out of ordinary though. For example, ResourceQuota has status.hard mirroring spec.hard, CRD has status.acceptedNames, which by the way has a similar use case, since the controller validates spec.names and populates the status field if spec.names is valid. In CRDs is quite common to see spec.field and status.field/status.observedField pairs

x13n · 2026-06-11T09:34:54Z

Should this be closed in favor of #9788?

norbertcyran · 2026-06-11T10:12:24Z

@x13n to be honest I'd still add the observed fields for observability purposes and preventing races. Let me rephrase the proposal a bit

k8s-ci-robot · 2026-06-11T13:00:34Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: norbertcyran
Once this PR has been reviewed and has the lgtm label, please ask for approval from x13n. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

cluster-autoscaler/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jmdeal · 2026-06-11T15:19:13Z

      memory: 128Gi
      nodes: 50
+  conditions:
+  - type: Valid


Since this is meant to be a shared resource across clusters, how do we think about this validation status condition in multi-tentant clusters? For example, I know of some users running both Karpenter and Cluster Autoscaler in the same cluster (or multiple versions of Karpenter). Having a single validation condition means that multiple autoscalers with conflicting validation policies may fight over this value.

In addition to how we represent this via conditions, what do we think the semantics for validation in multi-tenant clusters should be? Should an autoscaler respect a limit so long as its own validation succeeds, or must all validations succeed for it to be respected by any autoscaler?

Tbh I don't know much about multi-autoscaler setups. How do you route unschedulable pods to the corresponding autoscaler to prevent multiple scale ups?

At the simplest, we could add a label selector option to the controller, so it handles quotas only from a single autoscaler. Or an option to disable the reconciliation in one of the autoscalers, so only one takes ownership of the quotas. Though in such case the chosen owner would have to implement both validation policies

Partitioning pods across autoscalers is currently done with taints and label selectors. That being said, I don't think that's the right solution for this API. If you want to set a restriction on something like the number of a certain instance type, you probably don't want to pre-partition that across autoscalers (e.g. if you allow 10 c5.larges you wouldn't want to say CAS gets 5 and Karpenter gets 5).

I had a couple of ideas of how this could be addressed, depending on the semantics that we want for multi-autoscaler. The cleanest solution from an API standpoint would be for each autoscaler to &= the validation condition for a given spec generation rather than setting it directly. If it fails validation for any autoscaler it would be set to false and subsequent updates could not set it to true until the generation is bumped. There would be a race where, if the first autoscaler marks it as valid and the second does not it could still be temporarily respected by the autoscaler that validated it, but I think that's fine. We could get the same semantics without the race using validating admission policies.

The other solution would be to have a validated condition per autoscaler (e.g. ValidatedCAS, ValidatedAWSKarpenter, ValidatedEKSKarpenter). Each autoscaler would only respect its own validation condition. It would be the responsibility of autoscaler maintainers to ensure keys don't conflict.

It will be used by the backend to enforce the quota

One higher-level question though - if this is being used to enforce the quota rather than an in-memory store, what's the advantage of this approach over VAPs? Users could edit the status directly in the same way that they could remove a VAP or webhook configuration. Would the autoscaler always revalidate the observedSpec before applying the limit? Using VAPs would remove these concerns since it ensures all autoscalers validations have passed before admission.

++ to the notion of users running multiple autoscalers, we are seeing this as well with users wanting to use karpenter and cluster autoscaler in some topologies.

The cleanest solution from an API standpoint would be for each autoscaler to &= the validation condition for a given spec generation rather than setting it directly

It breaks idempotency of the reconciler though. Also as you mentioned, it introduces a race. I think using VAP/webhook are better options for this specific use case

Using condition per autoscaler pollutes the API IMO, though it would work, I think. Autoscalers could accept a config param like --capacity-quota-valid-condition=cloud.google.com/cq-valid.

However, to keep it simple, I think I'd lean towards leaving the ownership of validation to a single, chosen autoscaler. In CAS, we implemented a Validator interface for custom CQ validation options:

autoscaler/cluster-autoscaler/resourcequotas/capacityquota/validator.go

Line 28 in 3d577c6

type Validator interface {

Such a chosen autoscaler would need to implement all validations. All autoscalers would be required to enforce the quota, but only one would be validating it.

if this is being used to enforce the quota rather than an in-memory store, what's the advantage of this approach over VAPs?

I'd say it's a different use case, right now I'd keep the observed fields only to prevent races between enforcement and validation - i.e. the autoscaler would not use a new spec until the reconciler validates it. Their goal would be no longer to protect from manual status updates, since this can't really be fully protected without that in-memory store.

I think we should avoid changing the behavior of the quota depending of the autoscaler that processes it. IMO it can be slightly different from cluster to cluster, but within one cluster, that behavior should remain the same, despite the autoscaler looking at the quota. Having that said, we could drop the valid condition and observed fields and rely on VAPs/webhooks, though that leaves more space for races and bypassing the validation by the cluster admin

I think for the complex setups with multiple node autoscalers and additional constraints on what CapacityQuota objects are allowed, this shouldn't be some consensus between multiple node autoscaler instances, but rather a single policy. The simplest approach would be to implement this on admission. In such case only the intersection of valid configurations would be allowed. Alternatively, cluster admin could decide on their own how to set up the admission. In my mental model, Valid means the object itself is well formed and any additional constraints are only a policy thing, to further restrict some configurations in a specific environment. I believe this is at least the case with Cluster Autoscaler - as far as I can tell, having a correct label selector is pretty much everything that is required. @jmdeal do you foresee any technical limitations that would prevent some versions of Karpenter from correctly handling some CapacityQuota configurations? Or is it only about policy?

k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/documentation Categorizes issue or PR as related to documentation. labels Jun 5, 2026

k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jun 5, 2026

k8s-ci-robot requested review from aleksandra-malinowska and elmiko June 5, 2026 16:38

k8s-ci-robot assigned x13n Jun 5, 2026

Add observed status fields to the CQ proposal

f69ba70

norbertcyran force-pushed the cq-observed-limits branch from 99401f3 to f69ba70 Compare June 11, 2026 13:00

jmdeal reviewed Jun 11, 2026

View reviewed changes

Conversation

norbertcyran commented Jun 5, 2026

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

Uh oh!

k8s-ci-robot commented Jun 5, 2026

Uh oh!

norbertcyran commented Jun 5, 2026

Uh oh!

norbertcyran commented Jun 5, 2026

Uh oh!

x13n commented Jun 8, 2026

Uh oh!

norbertcyran commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

x13n commented Jun 11, 2026

Uh oh!

norbertcyran commented Jun 11, 2026

Uh oh!

k8s-ci-robot commented Jun 11, 2026

Uh oh!

jmdeal Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

jmdeal Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

norbertcyran Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

jmdeal Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

elmiko Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

norbertcyran Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

x13n Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

norbertcyran commented Jun 8, 2026 •

edited

Loading

norbertcyran Jun 12, 2026 •

edited

Loading