Fix off-by-one in dogsdogsdogs lookup_map (index(1) -> index(0))#765
Fix off-by-one in dogsdogsdogs lookup_map (index(1) -> index(0))#765oflatt wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes a panic in dogsdogsdogs::operators::lookup_map caused by reading from a capacity-1 KeyContainer at index(1) instead of index(0), and adds a regression test that exercises the WCOJ extend -> propose/count/validate -> lookup_map path on a simple triangle input.
Changes:
- Fix out-of-bounds
KeyContaineraccess inlookup_mapby switchingindex(1)toindex(0)for bothseek_keyandget_keycomparisons. - Add a regression test that runs a triangle WCOJ dataflow and asserts the expected
(0, 1, 2)result.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
dogsdogsdogs/src/operators/lookup_map.rs |
Fixes the off-by-one key staging bug that causes lookup_map to panic on probes. |
dogsdogsdogs/tests/lookup_map_regression.rs |
Adds an integration regression test that would panic pre-fix and should find the triangle post-fix. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| .inspect_batch(move |_t, xs| { | ||
| let mut v = found_outer.lock().unwrap(); | ||
| for (data, _t, _r) in xs { | ||
| v.push(*data); | ||
| } | ||
| }) |
| //! `lookup_map` stages the probe key into a capacity-1 `KeyContainer`, pushes one | ||
| //! element (valid index 0), and (before the fix) read it back at index 1 — out of | ||
| //! bounds, panicking on *every* probe. This test drives a triangle worst-case-optimal |
a296584 to
45739ae
Compare
|
For context, I'm working on an experimental version of egglog that uses flowlog + dogsdogsdogs as the backing database engine |
|
I can confirm that I found exactly the same problem, and my agent fixed it the same way. Only that I didn't put up a PR :) Thank you! |
|
Yup; looks very plausible. I'm up for merging, but I'm going to need to rethink the testing posture. Much of this is not actively tested / maintained (vs the |
key_con is cleared and has exactly one element pushed (push_own), so the only valid index is 0; index(1) reads past the end and panics on every lookup ("index out of bounds: the len is 1 but the index is 1"). Regressed in TimelyDataflow#628 (removing BatchContainer::borrow_as forced a staged KeyContainer that mis-indexed).
Adds a regression test (dogsdogsdogs/tests/lookup_map_regression.rs): a triangle WCOJ over a one-triangle in-memory graph driving lookup_map via extend; it panics on the unfixed code and passes after the fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
45739ae to
f8d47af
Compare
|
Yeah the test is a small example- I ran into a bug with a much larger egglog test |
dogsdogsdogs::operators::lookup_mapstages the probe key into a capacity-1KeyContainer, pushes one element (valid index 0), but reads it back atindex(1)— out of bounds — for both theseek_keyandget_keycomparisons. This panics on every probe, so anylookup_map-based operator (e.g. the worst-case-optimal delta-query pathextend→propose/count/validate→lookup_map) is unusable.Fix: read the staged key at
index(0).Test: adds
dogsdogsdogs/tests/lookup_map_regression.rs, which drives a triangle worst-case-optimal join over a small in-memory triangle — it panics before the fix and finds(0, 1, 2)after.Found while building a worst-case-optimal join backend on differential dataflow for an e-graph / Datalog engine.
🤖 Generated with Claude Code