feat(vi/cvi): auto-recover VirtualImage and ClusterVirtualImage from ImageLost #2564
Open
danilrwx wants to merge 1 commit into
Restart the import process when a Ready image is lost in DVCR, for recoverable data sources (HTTP, ContainerImage, ObjectRef). Upload images stay in ImageLost since their data cannot be re-fetched.

Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
338e9b5 to 9454c18
Description
Automatically recover `VirtualImage` and `ClusterVirtualImage` resources that entered the `ImageLost` phase (image is missing in DVCR).

Previously the `LifeCycleHandler` did an unconditional early return for `ImageLost`, so the only way to recover was to delete and recreate the resource. Now, when the image is lost:

- for recoverable data sources (`HTTP`, `ContainerImage`, `ObjectRef`) the controller resets the status to `Pending`, runs `sources.CleanUp` and requeues, reusing the same import-restart path already used on spec changes;
- for `Upload` the resource stays in `ImageLost`, since the data was uploaded once and cannot be re-fetched.

A `VirtualImageLostRecovering` / `ClusterVirtualImageLostRecovering` event is emitted on recovery start (a `recorder` was added to the CVI `LifeCycleHandler`, which previously had none). The `ImagePVCLost` phase is not affected.

Handler order (`LifeCycleHandler` → `ImagePresenceHandler`) rules out a race: after the phase is reset to `Pending`, the presence handler (which only acts on `Ready`) skips the resource within the same reconcile.

Why do we need it, and what problem does it solve?
The image presence monitoring correctly moves VI/CVI to `ImageLost` when the image disappears from DVCR (manual deletion, storage failure, garbage collection), but the resource then gets stuck there. For most sources the data is fully reproducible: the URL, external registry or referenced cluster resource is still available, so a re-import would restore the image without user action.

This is especially valuable for a mass DVCR storage failure, where re-creating dozens of resources by hand is impractical.
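The recovery decision and the handler ordering from the Description can be illustrated with a small standalone model. This is only a sketch, not the code from this PR: `Image`, `isRecoverable`, `lifeCycleStep`, `presenceStep` and `reconcile` are hypothetical names, the cleanup callback stands in for `sources.CleanUp`, and the order of the event, the status reset and the cleanup inside the handler is an assumption.

```go
package main

import "fmt"

// Phase and DataSourceType reuse the names from the description;
// the types themselves are illustrative, not the project's API.
type Phase string

const (
	PhasePending   Phase = "Pending"
	PhaseReady     Phase = "Ready"
	PhaseImageLost Phase = "ImageLost"
)

type DataSourceType string

const (
	SourceHTTP           DataSourceType = "HTTP"
	SourceContainerImage DataSourceType = "ContainerImage"
	SourceObjectRef      DataSourceType = "ObjectRef"
	SourceUpload         DataSourceType = "Upload"
)

// Image is a hypothetical stand-in for a VirtualImage or ClusterVirtualImage.
type Image struct {
	Source DataSourceType
	Phase  Phase
	Events []string
}

// isRecoverable reports whether the source data can be fetched again.
func isRecoverable(s DataSourceType) bool {
	switch s {
	case SourceHTTP, SourceContainerImage, SourceObjectRef:
		return true
	default:
		return false
	}
}

// lifeCycleStep models the LifeCycleHandler: a lost image with a recoverable
// source is reset to Pending, cleaned up, and a requeue is requested.
func lifeCycleStep(img *Image, cleanUp func()) (requeue bool) {
	if img.Phase != PhaseImageLost || !isRecoverable(img.Source) {
		return false // Upload (or any other phase): nothing to restart
	}
	img.Events = append(img.Events, "VirtualImageLostRecovering")
	img.Phase = PhasePending
	cleanUp() // stand-in for sources.CleanUp (assumption)
	return true
}

// presenceStep models the ImagePresenceHandler, which only acts on Ready.
func presenceStep(img *Image, presentInDVCR bool) {
	if img.Phase == PhaseReady && !presentInDVCR {
		img.Phase = PhaseImageLost
	}
}

// reconcile runs the handlers in the order stated in the description.
func reconcile(img *Image, presentInDVCR bool) bool {
	requeue := lifeCycleStep(img, func() {})
	presenceStep(img, presentInDVCR) // sees Pending, so it skips the image
	return requeue
}

func main() {
	for _, src := range []DataSourceType{SourceHTTP, SourceUpload} {
		img := &Image{Source: src, Phase: PhaseImageLost}
		requeue := reconcile(img, false)
		fmt.Println(src, img.Phase, requeue)
	}
}
```

In this model an `HTTP` image leaves the reconcile in `Pending` with a requeue requested, while an `Upload` image remains in `ImageLost`. Because `presenceStep` runs after the phase has been reset, it cannot push the image back to `ImageLost` in the same pass even though the image is still absent from DVCR.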
What is the expected result?
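1. Create a VI/CVI with an `HTTP`/`ContainerImage`/`ObjectRef` source, wait for `Ready`.
2. Remove the image from DVCR; the resource moves to `ImageLost`.
3. The resource goes through `Pending`/`Provisioning` and returns to `Ready`; a `VirtualImageLostRecovering` event is recorded.
4. With an `Upload` source, the resource stays in `ImageLost` and no re-import happens.

Checklist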
Changelog entries