
systemd: improve robustness of certificate renewals #55

Merged: byKryptogram merged 1 commit into main from mattias/renewals on Jul 3, 2026

Conversation

@byKryptogram byKryptogram commented Jul 3, 2026

Contributor

For background, see the conversation in the chat.

Summary by CodeRabbit

  • Bug Fixes
    • Improved renewal scheduling so it runs at a specific weekly time and includes a short delay after boot.
    • Added automatic retry behavior if renewal fails, with controlled backoff to reduce repeated failures.

@byKryptogram byKryptogram requested a review from a team as a code owner July 3, 2026 09:22
coderabbitai Bot commented Jul 3, 2026

Contributor


Review details
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 77bdbabf-e65d-4b34-b6eb-499a4000cb9f

📥 Commits

Reviewing files that changed from the base of the PR and between dfe7990 and adfb434.

📒 Files selected for processing (4)
  • deb/usr/lib/systemd/system/dnstapir-renew.service
  • deb/usr/lib/systemd/system/dnstapir-renew.timer
  • rpm/SOURCES/dnstapir-renew.service
  • rpm/SOURCES/dnstapir-renew.timer
📝 Walkthrough


Systemd unit files for dnstapir-renew are updated across deb and rpm packaging trees. Service files gain start-limit and restart-on-failure directives; timer files replace weekly calendar scheduling with a boot delay plus a fixed Monday 12:00 Europe/Stockholm schedule.

Changes

Systemd service and timer configuration

| Layer / File(s) | Summary |
|---|---|
| Service restart and rate-limit directives: deb/usr/lib/systemd/system/dnstapir-renew.service, rpm/SOURCES/dnstapir-renew.service | Adds StartLimitIntervalSec=75h and StartLimitBurst=72 in [Unit], and Restart=on-failure with RestartSec=1h in [Service]. |
| Timer scheduling update: deb/usr/lib/systemd/system/dnstapir-renew.timer, rpm/SOURCES/dnstapir-renew.timer | Replaces OnCalendar=weekly with OnBootSec=5m and OnCalendar=Mon *-*-* 12:00:00 Europe/Stockholm. |

Estimated code review effort: 1 (Trivial) | ~5 minutes

Poem

A rabbit checks the clock with glee,
Monday noon in Stockholm's tree,
Restart on failure, wait an hour,
No more storms of endless power,
Hop hop hooray for steady time! 🐇⏰

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: systemd unit updates to make certificate renewal more robust. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Contributor


🧹 Nitpick comments (2)
deb/usr/lib/systemd/system/dnstapir-renew.service (1)

4-5: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Document the retry-budget rationale.

StartLimitBurst=72 with RestartSec=1h yields ~72h of hourly retries inside the 75h StartLimitIntervalSec window — a reasonable "keep retrying for ~3 days" policy, but the numbers are non-obvious without context. A short comment above these directives would help future maintainers understand the intended retry budget instead of having to reverse-engineer it.

Also applies to: 16-17
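
The retry-budget arithmetic in this comment can be checked with a short sketch. It assumes every attempt fails immediately, so consecutive starts are exactly RestartSec apart, and that the start limit counts starts in a window that opens at the first start. The function name `start_limit_hit_after` is hypothetical and is not part of systemd or this repository.

```python
from datetime import timedelta
from typing import Optional


def start_limit_hit_after(
    burst: int, restart_sec: timedelta, interval: timedelta
) -> Optional[timedelta]:
    """Time after the first start at which the next start would be refused.

    Assumption: attempts fail instantly, so start number k happens at
    (k - 1) * restart_sec. Returns None if the window expires before the
    burst is used up, in which case the limit is never reached.
    """
    # Start number burst + 1 would be attempted at burst * restart_sec.
    refused_at = restart_sec * burst
    if refused_at <= interval:
        return refused_at
    return None


# Values from the review comment: StartLimitBurst=72, RestartSec=1h,
# StartLimitIntervalSec=75h.
hit = start_limit_hit_after(72, timedelta(hours=1), timedelta(hours=75))
print(hit)
```

Under these assumptions the 72 permitted starts are used up after roughly three days, which matches the "keep retrying for ~3 days" reading. Renewal attempts that take noticeable time before failing would stretch the schedule and could push later starts past the 75h window.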

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deb/usr/lib/systemd/system/dnstapir-renew.service` around lines 4 - 5, Add a
brief inline comment near the dnstapir-renew.service restart-limit settings to
document the intended retry budget: explain that StartLimitBurst=72 with hourly
restarts gives roughly 72 hours of retries within the 75-hour
StartLimitIntervalSec window. Place it above the related
StartLimitIntervalSec/StartLimitBurst directives so maintainers can understand
the rationale without reverse-engineering the values.
deb/usr/lib/systemd/system/dnstapir-renew.timer (1)

9-10: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Consider Persistent=true for missed-run recovery.

The stated goal is robustness of renewals, but the timer still lacks Persistent=true. If the host is powered off across the Monday 12:00 window and doesn't reboot in the interim, the weekly run is simply skipped until the next Monday. OnBootSec=5m only covers hosts that reboot; always-on hosts down during the exact window get no catch-up run.

Proposed addition
 [Timer]
 OnBootSec=5m
 OnCalendar=Mon *-*-* 12:00:00 Europe/Stockholm
+Persistent=true
 AccuracySec=1h
 RandomizedDelaySec=100min
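
The catch-up decision that Persistent=true is meant to provide can be illustrated with a small model. This is only a sketch of the idea (compare the last trigger time with the schedule when the timer is activated), not systemd's implementation. Datetimes are naive and assumed to already be Europe/Stockholm local time, AccuracySec and RandomizedDelaySec are ignored, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def next_monday_noon(after: datetime) -> datetime:
    """First Monday 12:00 strictly after `after` (naive local time)."""
    candidate = after.replace(hour=12, minute=0, second=0, microsecond=0)
    # Monday is weekday 0, so this steps back to this week's Monday noon.
    candidate -= timedelta(days=after.weekday())
    if candidate <= after:
        candidate += timedelta(days=7)
    return candidate


def missed_run(last_trigger: datetime, activated_at: datetime) -> bool:
    """True if a scheduled run fell between the last trigger and activation."""
    return next_monday_noon(last_trigger) <= activated_at


# Last renewal ran on Monday 2026-06-29; the host was then off for a while.
last = datetime(2026, 6, 29, 12, 30)
print(missed_run(last, datetime(2026, 7, 3, 9, 0)))  # back before the next Monday
print(missed_run(last, datetime(2026, 7, 7, 8, 0)))  # back after Monday 2026-07-06
```

In this model a host that comes back on Tuesday has a missed run and would trigger a renewal right away, while one that comes back on Friday simply waits for the next Monday.
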
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deb/usr/lib/systemd/system/dnstapir-renew.timer` around lines 9 - 10, The
timer definition for dnstapir-renew should be updated to handle missed weekly
runs by enabling Persistent=true alongside the existing OnBootSec and OnCalendar
settings. Locate the systemd timer unit and add the persistent missed-run
recovery option so the renew job is triggered after downtime even if the Monday
12:00 window was missed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a985e131-2541-402a-9e62-a58fe79319e7

📥 Commits

Reviewing files that changed from the base of the PR and between 071d720 and dfe7990.

📒 Files selected for processing (4)
  • deb/usr/lib/systemd/system/dnstapir-renew.service
  • deb/usr/lib/systemd/system/dnstapir-renew.timer
  • rpm/SOURCES/dnstapir-renew.service
  • rpm/SOURCES/dnstapir-renew.timer

byKryptogram commented Jul 3, 2026

Contributor Author

Switching to Persistent=true instead of OnBootSec=5m, so that a renewal runs at boot only when the schedule says one is due.

@oej oej left a comment

Member


Super

@byKryptogram byKryptogram merged commit adfb434 into main Jul 3, 2026
3 checks passed
@byKryptogram byKryptogram deleted the mattias/renewals branch July 3, 2026 15:05