fix(read-path): reach complete threads, dense-second timelines, and all people in the GUI#1418
Open
tlongwell-block wants to merge 10 commits into
Open
fix(read-path): reach complete threads, dense-second timelines, and all people in the GUI#1418tlongwell-block wants to merge 10 commits into
tlongwell-block wants to merge 10 commits into
Conversation
Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Threads rendered silently incomplete because the desktop assembled them
from the channel cache instead of fetching the reply subtree. This wires
a first-class server-side thread read (`get_thread_replies` over
`thread_metadata`) through the bridge and a new Tauri command, and fixes
a correctness hole in its pagination.
The pagination bug: `get_thread_replies` keyset-ordered on
`event_created_at` alone, with `WHERE event_created_at > $cursor` and no
tiebreak. Thread replies routinely share a `created_at` second (bursty
threads), so once a page ended mid-second the cursor advanced past the
whole second and every remaining tied reply was silently dropped —
reproduced live on a seeded 40-reply thread spanning two seconds
(24 + 16): paging at limit 10 lost 14 of the 24 tied replies. That is
the exact "missed messages" class this work targets.
Fix (Tier-1, no schema change): composite `(event_created_at, event_id)`
keyset. The query orders and compares on the pair
(`WHERE (event_created_at, event_id) > ($ts, $id)
ORDER BY event_created_at ASC, event_id ASC`), and the cursor carries the
last reply's event id as the tiebreak end to end:
- buzz-db: composite cursor decode (8-byte BE seconds + raw event id) and
row-comparison keyset; a bare 8-byte cursor still works for back-compat.
- buzz-relay bridge: `extract_thread_cursor` reads `thread_cursor` +
`thread_cursor_id` and encodes the composite bytes.
- desktop: `ThreadCursor { created_at, event_id }` request/response type;
the `get_thread_replies` command derives the next cursor from the last
event's `(created_at, id)` transparently — no server-issued token.
Verified end to end against a live relay: paging the seeded thread at
limit 10 returns a set byte-identical to the full-thread oracle (40/40,
no gaps, no duplicates). New DB regression test pins the same-second-tie
case; +5 bridge unit tests cover the composite cursor encoding.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…the GUI Second half of the Tier-1 read-path work (the backend thread fetch + composite keyset landed in the prior commit). This wires the frontend to those server-side reads and adds a channel-timeline escape hatch, closing the two remaining "missed messages" gaps in the desktop GUI without a schema change or a hot-path rewrite. Threads (descendant gap): the thread panel derives its replies from the channel cache, and `useLoadMissingAncestors` only backfills *ancestors* (walking `e`-tags upward). Replies that fell outside the channel cold-load window were therefore never fetched — deep or old threads rendered silently incomplete. `useThreadReplies` closes this using the same cache seam: on thread open it pages `get_thread_replies` to the floor over the gap-free `(created_at, event_id)` keyset and merges each event into the channel cache (`mergeMessages`, dedupe by id). Idempotent per (channel, root); retries on error rather than caching a partial subtree. All downstream grouping/ordering/unread derivation is unchanged — the thread simply becomes complete. Channel timeline (dense-second wall): scrollback pages history over WS `REQ` with a bare `until` (`created_at`) cursor. Ordinary same-second overlap is duplicate-safe (relay `until` is inclusive), so the timeline does not drop messages in the normal case. The one genuinely unreachable case is a single `created_at` second holding more messages than one WS page: `until` keeps returning that second's newest slice, the boundary never advances, and everything behind it is stuck. `get_channel_messages_before` gives the pager a bridge composite-keyset fallback that advances within a tied second via `id > before_id` under the relay's `created_at DESC, id ASC` order. It activates *only* on a full-page same-second stall (the proven wall), seeds from the max event id already known at the boundary second (not `baseline[0].id`, which would re-request held rows), and drains that entire boundary second before honoring the per-pass row-floor/batch budget — yielding mid-second would strand the tail behind the same stall with no guaranteed sentinel re-arm. Older history then resumes on the normal budget. The WS live-tail/optimistic/cold-load hot path is untouched. Verification: new Playwright parity test seeds a 450-message dense second (>2 WS pages) behind the cold-load window and proves the keyset fallback fires and every row is reachable. It caught a real tail-stranding bug in the first cut (drain honored the batch budget mid-second, stranding 50 of 450 permanently); the boundary-second-drain invariant is the fix. Green: tsc, biome, file-sizes, px-text, cargo check, 1399 unit tests, and the new spec. `tauri.ts`'s existing file-size override is bumped for the two load-bearing bindings (split is known debt, a separate PR). Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Lane 2 (Wren): page user search so add-users / tag-users interactions reach every match instead of a hacky first-page limit. Independent of Lane 1's message/thread read-path (disjoint files); merged to land the read-path work as one PR. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz> * feat/read-path-people-model: test(desktop): cover people search pagination fix(desktop): page user search reachability
Keep the read-path branch mergeable so CI materializes the merge ref. Relay-only change, disjoint from the desktop read-path work. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz> * origin/main: fix(relay): enable Redis TLS for rediss:// (ElastiCache) (#1417)
…eads The channel-timeline dense-second fallback sent its composite keyset tiebreak under the filter key `n`, but the relay bridge's `extract_before_id` reads it from `before_id` (crates/buzz-relay/src/api/ bridge.rs). So on the real `/query` path `query.before_id` was never set: the fallback silently degraded to a bare inclusive `until`, re-returned the same full dense-second page, re-emitted the same cursor, and `drainOlderViaKeyset` could spin inside the boundary-second drain loop instead of breaking the wall — the exact case the escape hatch exists to fix. Caught by Wren in PR review. The Playwright spec passed anyway because its mock-store path reimplements the keyset in JS (never exercising the relay filter field), and config mode sent the same wrong key. Fix both call sites to send `before_id`, and add a Rust unit test that pins the field name so the client/relay contract can't drift silently again — extracting `build_channel_messages_before_filter` to make the emitted filter testable, mirroring `build_search_messages_filter`. Verified: 863/863 desktop cargo tests (incl. the 2 new filter tests), tsc, biome, file-sizes, px-text, 1399 node unit tests, dense-second Playwright. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
The empty-query people directory listing was terminal: it could never page past its first page, so on busy relays not all people were reachable via the DM picker / add-user / tag flows (Tyler's "all people" goal). Two halves, both fixed: - Relay: the non-search general query path (POST /query, no `search` field) built an EventQuery but never read the raw `page` extension, so no SQL OFFSET was set. Only `search`-bearing filters route to handle_bridge_search (which has its own page/per_page). Empty kind:0 listings fell through here. New `extract_page_offset` wires page -> (page-1)*limit into query.offset; query_events already applies OFFSET with deterministic `created_at DESC, id ASC` ordering, so offset paging is stable. Only fires when `page` > 1 is explicitly present, so keyset/general queries are untouched. - Desktop: search_users' empty-query branch called list_user_search_results, which always returned next_cursor: None, so the frontend could never request page 2. Now emits Some(page+1) when the relay returned a full page (raw events.len() >= max), mirroring the existing NIP-50 search branch. Raw length is the correct fullness signal since list_user_search_results dedupes/truncates. Config-mode e2eBridge delegates empty-query paging to the real relay /query, so this fix makes that path correct too; the mock-store path already paged locally. Added Rust unit tests pinning the page->offset contract (absent/page-1 -> None, page N -> (N-1)*limit, missing limit -> None) — the durable server-side guard the JS mock structurally can't verify. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…nings Rust Lint CI runs clippy with -D warnings, so the pre-existing `.as_ref().map(|c| c.as_slice())` on the Lane 1 thread-cursor path (bridge.rs) was a red gate regardless of provenance. `as_deref()` is the exact equivalent (Option<Vec<u8>> -> Option<&[u8]>). Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
The headline Lane-1 thread-backfill fix (get_thread_replies / useThreadReplies) was dead against a real relay: the desktop command built a KINDLESS /query filter, and the relay bridge p-gate (p_gated_filters_authorized) rejects any kindless filter with 403 before routing — a kindless filter 'could match' a p-gated kind, so it demands a #p tag we don't send. The relay routes the thread query purely off #e + depth_limit (kind does not filter it), but the gate still runs first. The e2e mock didn't model p-gating, so CI stayed green on a 403. Fixes: - messages.rs: give get_thread_replies' filter the non-p-gated TIMELINE_KINDS (extracted a shared const + build_thread_replies_filter helper mirroring the channel sibling, so the two filters can't drift). Guard unit tests assert the filter carries kinds and every kind is NOT in P_GATED_KINDS (the real invariant), plus composite-cursor paging. - e2eBridge.ts: model the relay p-gate for /query and WS REQ so a kindless #e read is now rejected in the mock too — closes the CI blind spot. Also fix the mock subtree walk to traverse nested replies transitively. - useThreadReplies.ts: clear the fetched-root mark on cancellation unless the paging loop completed, so an interrupted first fetch retries on reopen instead of silently stranding the feature. Narrow effect deps to primitive channel id/type to avoid object-identity churn. - Correct the false 'includes root at depth 0' contract at all three doc sites (messages.rs / tauri.ts / types.ts): the query keys on root_event_id, which root rows lack, so results are replies-only (depth >= 1). - thread.rs: fix the nested root-stub INSERT in insert_thread_metadata (column list omitted community_id while binding it, scrambling every placeholder) to match the correct production sibling in event.rs. Add depth-2 reachability and a direct stub-branch guard (Postgres-gated). - check-file-sizes.mjs: record load-bearing overrides for messages.rs (fix + guard tests crossed 1000 from 995) and tauri.ts (+3 doc-only lines). Verified: tauri 864/0, buzz-db 124/0 (incl. Postgres, serialized, with a red-check confirming the stub test fails on the pre-fix SQL), and a live clean-room relay at this tip: kindless #e -> 403, shipped kinded filter -> 200 with 250/250 dense same-second depth-1 replies + a depth-2 nested reply reached, root excluded. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What & why
This PR fundamentally fixes how the desktop GUI reaches channel messages, thread replies, and people — removing the "hacky limits" that silently miss messages/people. This is the Tier-1 read-path work: reachability via keyset pagination over the already-merged relay substrate (PR #1321), no schema change, no hot-path rewrite. Two lanes, one PR.
Lane 1 — messages + threads (Eva)
Threads (descendant gap). The thread panel derived replies from the channel cache, and
useLoadMissingAncestorsonly backfilled ancestors. Replies outside the cold-load window were never fetched, so deep/old threads rendered silently incomplete.useThreadRepliescloses it via the same cache seam: pageget_thread_repliesto the floor over the gap-free(created_at, event_id)keyset and merge into the channel cache (dedupe by id, idempotent per root, retry on error). Downstream grouping/ordering/unread unchanged.get_thread_replieskeyset-ordered onevent_created_atalone with no tiebreak, so a page ending mid-second dropped every remaining tied reply. Composite(event_created_at, event_id)keyset closes it. The query keys onroot_event_id(which root rows lack), so results are replies-only (depth >= 1) — root excluded by construction.Channel timeline (dense-second wall). Ordinary same-second overlap is duplicate-safe (relay
untilis inclusive), so the timeline doesn't drop messages normally. The one genuinely unreachable case: a singlecreated_atsecond with more messages than one WS page —untilnever advances, everything behind it is stuck.get_channel_messages_beforegives the pager a composite-keyset fallback that:id > before_idundercreated_at DESC, id ASC),Lane 2 — people search / mentions / members (Wren)
Pages user search so add-users / tag-users / mention interactions reach every match instead of a hacky first-page limit (
user_search.rs,useMentions.ts, members/mention/DM UIs,profile/hooks.ts). Empty-query directory browse is now paged end-to-end (relaypage→offset on the non-search/querypath; desktop emits a next cursor on full raw pages). E2E coverage inchannels.spec.ts+mentions.spec.ts.The blocker a live relay surfaced (and its fix)
Static review passed the thread path, but a dynamic redteam against a real relay caught a production-fatal blocker the mock hid:
get_thread_repliesbuilt a kindless/queryfilter, and the relay bridge p-gate (p_gated_filters_authorized) 403s any kindless filter before routing — a kindless filter "could match" a p-gated kind, so the gate demands a#p=selftag the thread read doesn't send. The relay routes the thread query purely off#e+depth_limit(kind doesn't filter it), but the gate runs first. The e2e mock never modeled p-gating, so CI stayed green on a 403 — every thread-backfill call would have failed in prod.Fixes:
messages.rs— give the thread filter the non-p-gatedTIMELINE_KINDS(via a shared const +build_thread_replies_filterhelper mirroring the channel-messages sibling, so the two p-gate filters can't drift). Guard unit tests assert the filter carries kinds and that every kind is absent fromP_GATED_KINDS(the real invariant), plus composite-cursor paging.e2eBridge.ts— model the relay p-gate for/queryand WS REQ so a kindless#eread is now rejected in the mock too (faithful port ofp_gated_filters_authorized), closing the CI blind spot. Also fix the mock subtree walk to traverse nested replies transitively (frontier BFS for depth > 1).useThreadReplies.ts— clear the fetched-root mark on cancellation unless the paging loop completed, so an interrupted first fetch retries on reopen instead of stranding the feature. Narrow effect deps to primitive channel id/type.messages.rs/tauri.ts/types.ts): results are replies-only.Verification
Live re-verify at the final tip (the gate that caught the blocker): built the relay at this SHA on a clean-room DB, seeded root + 250 dense same-second depth-1 replies + 1 depth-2 grandchild.
#efilter → HTTP 403 (blocker reproduced exactly),Green on the integrated branch:
tsc, biome, file-sizes, px-text,cargo check, unit suites (buzz-db 124/0 incl. Postgres with a red-check proving the stub test fails on the pre-fix SQL; desktop 864/0), the dense-second Playwright spec, and all pre-push CI hooks (desktop-test, rust-tests, mobile-test, desktop-tauri-test).Notes for review
relay_seqwork (Tier-2) — never implemented (spec-only), named as follow-on, not reopened here.messages.rs,tauri.ts) for load-bearing bindings/fixes from both lanes. Splitting the central bindings is known debt / a separate PR — no opportunistic refactor here.scroll-history.spec.ts"preserves user scroll" / "does not teleport" flake on local runners at the clean baseline too (wheel timing); pass under CI's retry policy. Not touched by this change.Final score: 9/10.