Split CapacityQuota usage and validation reconcilers#9800

Open
norbertcyran wants to merge 1 commit into
kubernetes:masterfrom
norbertcyran:cq-split-controllers

@norbertcyran

Contributor

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

The CapacityQuota controller currently does two things: it validates the quota and it calculates usage. Calculating usage is far more expensive, since it iterates over all nodes in the cluster, while validation only checks the spec of a single CapacityQuota.

The two steps also have different triggers. Validation should run on every CapacityQuota update, including status updates, in case a user edits the status manually. As a consequence, the validation reconciler always runs twice in a row, because its own status update triggers another reconcile. That is acceptable for validation, but we definitely want to avoid it for the usage calculation. The usage reconciler needs to watch Nodes and CapacityQuota spec changes, but it does not need to watch CapacityQuota status changes.

Splitting the controllers has one more benefit: each gets its own queue, so validation can keep running even when the usage reconciler's queue is overloaded.
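To illustrate the trigger split described above, here is a minimal standard-library sketch. It is not the code in this PR. The names `Event`, `needsValidation`, `needsUsageRecalc`, and `queues` are hypothetical, and it assumes that CapacityQuota spec changes bump `metadata.generation` while status-only writes do not (the usual behaviour when the status subresource is enabled).

```go
package main

import "fmt"

// CapacityQuota is a simplified stand-in for the real API object.
// Assumption: Generation changes only when the spec changes.
type CapacityQuota struct {
	Name       string
	Generation int64
}

// EventKind distinguishes the event sources in this sketch.
type EventKind int

const (
	QuotaUpdated EventKind = iota // any CapacityQuota update (spec or status)
	NodeChanged                   // any Node add/update/delete
)

// Event is a hypothetical watch event. Old and New are set only for QuotaUpdated.
type Event struct {
	Kind     EventKind
	Old, New *CapacityQuota
}

// needsValidation reports whether the validation reconciler should run:
// on every CapacityQuota update, including status-only updates.
func needsValidation(e Event) bool {
	return e.Kind == QuotaUpdated
}

// needsUsageRecalc reports whether the usage reconciler should run:
// on Node changes and on CapacityQuota spec changes, but not on status-only updates.
func needsUsageRecalc(e Event) bool {
	switch e.Kind {
	case NodeChanged:
		return true
	case QuotaUpdated:
		return e.Old.Generation != e.New.Generation
	}
	return false
}

// queues models the two independent work queues, one per reconciler.
type queues struct {
	validation []Event
	usage      []Event
}

// route enqueues the event on every queue whose trigger matches.
func (q *queues) route(e Event) {
	if needsValidation(e) {
		q.validation = append(q.validation, e)
	}
	if needsUsageRecalc(e) {
		q.usage = append(q.usage, e)
	}
}

func main() {
	q := &queues{}

	// Spec update: generation changes, so both reconcilers are triggered.
	q.route(Event{Kind: QuotaUpdated,
		Old: &CapacityQuota{Name: "cq", Generation: 1},
		New: &CapacityQuota{Name: "cq", Generation: 2}})

	// Status-only update: generation is unchanged, so only validation runs.
	q.route(Event{Kind: QuotaUpdated,
		Old: &CapacityQuota{Name: "cq", Generation: 2},
		New: &CapacityQuota{Name: "cq", Generation: 2}})

	// Node change: only the usage reconciler runs.
	q.route(Event{Kind: NodeChanged})

	fmt.Println("validation queue length:", len(q.validation))
	fmt.Println("usage queue length:", len(q.usage))
}
```

Because the two queues are independent in this model, a backlog in `usage` never delays items in `validation`, which is the second benefit the description mentions.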

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 11, 2026
@k8s-ci-robot

Contributor

This issue is currently awaiting triage.

If SIG Autoscaling contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added area/cluster-autoscaler Issues or PRs related to the Cluster Autoscaler component do-not-merge/needs-area Indicates that a PR should not merge because it lacks an area label. labels Jun 11, 2026
@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: norbertcyran
Once this PR has been reviewed and has the lgtm label, please assign towca for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/needs-area Indicates that a PR should not merge because it lacks an area label. label Jun 11, 2026
@k8s-ci-robot k8s-ci-robot requested a review from mtrqq June 11, 2026 12:30
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 11, 2026
@norbertcyran norbertcyran force-pushed the cq-split-controllers branch from 885f82a to 40acb4a Compare June 11, 2026 12:31
@norbertcyran

Contributor Author

/assign x13n

@rrangith rrangith left a comment

Member

/lgtm

one small nit, otherwise looks good!

}
}

// Reconcile reconciles CapacityQuota's current usage..

Member

Suggested change
- // Reconcile reconciles CapacityQuota's current usage..
+ // Reconcile reconciles CapacityQuota's current usage.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 12, 2026
Labels

area/cluster-autoscaler Issues or PRs related to the Cluster Autoscaler component cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

4 participants