fix(worker): windowed per-tenant quota + restore #230 (end webhook 429 lockout) #233

Merged: liplus-lin-lay merged 2 commits into main from fix/quota-windowed-reset on Jun 22, 2026

@liplus-lin-lay

Member

Summary

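Fixes the root cause of #231. Every webhook delivery since 06-21 14:27 JST returned 429, and the cause was neither the signature change (#230) nor the CF plan or its limits. The per-tenant quota in the TenantRegistry DO (quotas.events_stored) was a monotonically increasing lifetime counter, so once it reached 10,000 every delivery was rejected with 429 permanently (the count lives in persistent DO storage, so it survives deploys). A signature failure would return 403, which contradicts the 429 observed in production; that settles the diagnosis.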

Changes

  1. Make the quota a 1-hour windowed throughput counter (worker/src/tenant.ts). /quota-check opens a new window (events_stored=1, window_started_at=now) when the previous window has elapsed or was never set. This structurally removes the lifetime lock. Existing tenants have window_started_at=0, which is read as an elapsed window, so they recover automatically on the next check (this stops the bleeding in production). Includes the schema migration (ALTER TABLE ADD COLUMN).
  2. Un-revert #230, "fix(worker): constant-time signature verification (patch)" (worker/src/signature.ts + test). #230 was not at fault, but it was mistakenly reverted in #232, "Revert #230 (constant-time signature swap) to restore live webhook delivery", which reopened a timing side channel. This restores constant-time signature verification.
  3. Tests (tenant.test.ts): add cases for in-window enforcement (429 at the limit) and for a stale window of 0 being reset (no permanent lock; see #231, "bug(worker): #230 signature swap broke live webhook delivery - all deliveries 429, intake silent").
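
To illustrate what item 2 restores, here is a minimal sketch of a constant-time digest comparison. The contents of worker/src/signature.ts are not shown in this PR, so `constantTimeEqual` and `hexToBytes` are hypothetical names and this is not necessarily how #230 implements it.

```typescript
// Hypothetical helpers; the real worker/src/signature.ts may differ.

// Decode a hex string (e.g. an HMAC digest from a signature header) to bytes.
// Returns null for malformed input instead of throwing.
function hexToBytes(hex: string): Uint8Array | null {
  if (hex.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(hex)) return null;
  const out = new Uint8Array(hex.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}

// Compare two byte arrays without exiting at the first mismatching byte.
// The length check can exit early because digest length is not secret.
function constantTimeEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a[i] ^ b[i]; // accumulate differences instead of branching
  }
  return diff === 0;
}
```

A naive `expected === received` string comparison may return as soon as one character differs, which is the timing side channel the revert reopened. Either way, a mismatch here would surface as a 403, not as the 429 seen in production.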

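Impact scope

patch. The goal is to restore webhook delivery.

Verification

Green CI alone does not count as done (the lesson from #230). After merge and the Workers Builds auto-deploy, probe with a real event and confirm in production that deliveries return 2xx and that the event shows up in get_pending_status.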

Refs #231 / restores #230 / supersedes the #232 revert

…231)

The per-tenant quota counter (TenantRegistry DO, quotas.events_stored) only
ever incremented and was never reset, so once a tenant's lifetime ingested-
event count reached events_limit (10000) every subsequent non-installation
webhook was rejected with 429 permanently — surviving worker redeploys because
the count lives in DO storage. This is the real cause of the 2026-06-21 webhook
outage (not #230: a broken signature returns 403, but deliveries were 429).

Make the quota a 1h windowed throughput counter: /quota-check starts a fresh
window (events_stored=1, window_started_at=now) whenever the previous window
elapsed or was never set. Migrated rows have window_started_at=0, read as an
elapsed window, so the locked production tenant self-clears on the next check.

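Make the per-tenant quota time-windowed, structurally removing the permanent 429
lock caused by lifetime accumulation. Existing tenants have window_started_at=0,
which is read as an elapsed window, so they recover automatically on the next check.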
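
The windowed check described in this commit message can be sketched as a pure function. This is an illustration under stated assumptions, not the code in worker/src/tenant.ts. The column names come from the PR; `QuotaRow`, `QuotaDecision`, `quotaCheck`, the use of epoch milliseconds, and the `>=` comparisons are assumptions.

```typescript
// Assumed row shape: the three column names appear in the PR, the rest is illustrative.
interface QuotaRow {
  events_stored: number;
  events_limit: number;
  window_started_at: number; // epoch milliseconds (assumption); 0 on migrated rows
}

interface QuotaDecision {
  status: 200 | 429;
  row: QuotaRow; // the row as it should be written back
}

const WINDOW_MS = 60 * 60 * 1000; // 1 hour window

function quotaCheck(row: QuotaRow, now: number): QuotaDecision {
  // A window_started_at of 0 is far in the past, so it always counts as elapsed.
  const windowElapsed = now - row.window_started_at >= WINDOW_MS;
  if (windowElapsed) {
    // Open a fresh window and count the current event as its first.
    return { status: 200, row: { ...row, events_stored: 1, window_started_at: now } };
  }
  if (row.events_stored >= row.events_limit) {
    // Limit reached inside the current window: reject, but only until the window ends.
    return { status: 429, row };
  }
  return { status: 200, row: { ...row, events_stored: row.events_stored + 1 } };
}
```

Under these assumptions, a row locked at events_stored=10000 with window_started_at=0 is accepted on its next check and rewritten to events_stored=1, which is the self-clearing behavior described above.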

Refs #231
@cloudflare-workers-and-pages


Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ✅ Deployment successful! | github-webhook-mcp | b85fea5 | Jun 22 2026, 08:10 AM |

@liplus-lin-lay liplus-lin-lay left a comment

Member Author


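self-review (AI): Windowed-quota fix for the root cause (permanent lock from the lifetime quota counter), plus restoration of #230. CI is all green, including the new quota tests (in-window enforcement and stale-0-window reset). Scope: tenant.ts (windowing + ALTER migration), tenant.test.ts, signature.ts (+ test, restoring #230). Release type: patch (the goal is webhook recovery). After merge, Workers Builds deploys, then a production probe confirms that deliveries return 2xx and that events land in get_pending_status (green CI does not count as done, the lesson from #230). The migration is expected to work as follows: the ALTER ADD COLUMN in ensureTables gives existing quotas rows window_started_at=0, and the production tenant then recovers automatically on the next /quota-check.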
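
The migration step mentioned in the self-review could look roughly like the sketch below. It is not the actual ensureTables code. `SqlLike` is a hypothetical minimal interface loosely modeled on a Durable Object SQL handle, `ensureWindowColumn` is an invented name, and the `PRAGMA table_info` guard assumes that pragma is available in the target storage.

```typescript
// Hypothetical minimal SQL handle; the real storage API may expose more than this.
interface SqlLike {
  exec(query: string): { toArray(): Array<Record<string, unknown>> };
}

// Add window_started_at to the quotas table if it is missing.
// Returns true when the column was added, false when it already existed.
function ensureWindowColumn(sql: SqlLike): boolean {
  const columns = sql.exec("PRAGMA table_info(quotas)").toArray();
  const hasColumn = columns.some((c) => c["name"] === "window_started_at");
  if (hasColumn) return false;
  // DEFAULT 0 backfills existing rows, so they read as an already-elapsed window.
  sql.exec("ALTER TABLE quotas ADD COLUMN window_started_at INTEGER NOT NULL DEFAULT 0");
  return true;
}
```

Guarding the ALTER keeps the step idempotent, which matters if it runs on every startup: SQLite rejects adding a column that already exists.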
