diff --git a/CHANGELOG.md b/CHANGELOG.md
index 4644bab5..dd4a67df 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,8 +9,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 - **`generate_synthetic_control_data()` data generator + a capstone `SyntheticControl` tutorial.** New public generator (`diff_diff/prep_dgp.py`, exported from `diff_diff`) builds a **single-treated-unit** factor-model panel for synthetic-control demos and tests: one treated unit whose latent factor loadings and baseline are an exact convex combination of a few donors (so the noiseless trajectory lies in the donor convex hull and a synthetic control reproduces it closely — the observed fit is approximate under added noise), persistent AR(1) factors, predictor covariates that each proxy a distinct factor, a common calendar time effect, and a known `"ramp"` or `"constant"` treatment effect emitted as `true_effect`. Tutorial **`docs/tutorials/25_synthetic_control_policy.ipynb`** walks the whole `SyntheticControl` surface end-to-end on a policy-evaluation story (one state adopts a clean-energy standard), structured around **two inference philosophies**: cross-unit permutation (`in_space_placebo` + Firpo–Possebom `confidence_set`, with `leave_one_out` / `in_time_placebo` robustness) versus over-time conformal (CWZ `conformal_test` / `conformal_confidence_intervals` / `conformal_average_effect`), with the per-period conformal band as the climax. A `tests/test_t25_synthetic_control_policy_drift.py` drift guard re-derives every quoted number from the generator.
+- **`TwoStageDiD` methodology validation (Gardner 2022 / `did2s`).** New `tests/test_methodology_two_stage.py` with paper-equation-numbered Verified Components (§3 two-stage procedure / eqs. 4 & 6, §3.3 GMM variance, fn. 19 always-treated exclusion, Proposition 5, covariate path, `balance_e`, `vcov_type` narrowing) plus a `did2s::did2s()` cross-language parity fixture (`benchmarks/R/generate_did2s_golden.R` → `benchmarks/data/did2s_golden.json` + `did2s_test_panel.csv`), pinning overall + event-study ATT (abs 1e-6) and SE (abs 1e-7). `METHODOLOGY_REVIEW.md` `TwoStageDiD` row flipped to **Complete**.
 
 ### Fixed
+- **`TwoStageDiD` analytical GMM standard errors are now exact (match R `did2s` to ~1e-7).** The Gardner two-stage GMM sandwich `_compute_gmm_variance` derived its residuals from the *iterative* alternating-projection first-stage fixed effects (`_iterative_fe`, which converge only to ~1e-7 on unbalanced untreated panels) while computing `gamma_hat` exactly — leaving the variance ~1% off the analytical sandwich. The variance now re-solves the Stage-1 FE **exactly** (sparse OLS, reusing the `gamma_hat` factorization), and `_build_fe_design` gained an intercept column so its column space spans the grand mean (the prior intercept-free design omitted it, and the exact residual is first-order sensitive to it). Unidentified-FE obs (rank-deficient / Proposition-5) fall back to the iterative residual, so those edge cases are unchanged; the reported `overall_att` still uses the iterative FE (point-estimate equivalence with `ImputationDiD` preserved). Mirrors the same-class fix already applied to `ImputationDiD`'s exact-sparse variance.
 - **`LinearRegression.get_se()` / `get_inference()` no longer return a `NaN` standard error from a tiny-negative variance artifact.** A high-leverage / degenerate coefficient (e.g. an absorbed-FE dummy near-collinear with the treatment, whose Bell-McCaffrey Satterthwaite DOF already hits the noise-floor guard) can have a CR2/HC variance of ~0 (≈1e-32) whose vcov diagonal lands just-below-zero under BLAS-dependent float rounding; `np.sqrt` of the negative then produced a `NaN` SE **nondeterministically** — passing single-threaded but failing under the parallel pure-Python full-suite run (`tests/test_methodology_wls_cr2.py::TestLinearRegressionFENanGuardEndToEnd::test_did_absorbed_fe_lr_inference_nan_for_guarded_coefs`). Both SE sites now clamp the vcov diagonal at 0, so the SE is finite (0 for a genuinely-zero variance), deterministic, and BLAS-independent. **No change for any positive variance** (the clamp is a no-op there); only the previously-`NaN` degenerate case is affected.
 
 ## [3.5.2] - 2026-06-08
diff --git a/METHODOLOGY_REVIEW.md b/METHODOLOGY_REVIEW.md
index 9f9611b8..74ce9f7c 100644
--- a/METHODOLOGY_REVIEW.md
+++ b/METHODOLOGY_REVIEW.md
@@ -48,7 +48,7 @@ The catalog grew incrementally over several quarters, so formats vary across the
 | SunAbraham | `sun_abraham.py` | `fixest::sunab()` | **Complete** | 2026-02-15 |
 | StackedDiD | `stacked_did.py` | `stacked-did-weights` (Wing-Freedman-Hollingsworth code) | **Complete** | 2026-02-19 |
 | ImputationDiD | `imputation.py` | `didimputation` | **Complete** | 2026-06-06 |
-| TwoStageDiD | `two_stage.py` | `did2s` | **In Progress** | — |
+| TwoStageDiD | `two_stage.py` | `did2s` | **Complete** | 2026-06-23 |
 | WooldridgeDiD (ETWFE) | `wooldridge.py` | `etwfe` (R) / `jwdid` (Stata) | **Complete** | 2026-05-22 |
 | EfficientDiD | `efficient_did.py` | (no canonical R package) | **Complete** | 2026-06-01 |
 
@@ -583,19 +583,36 @@ and covariate-adjusted specifications.)
 | Module | `two_stage.py`, `two_stage_bootstrap.py` |
 | Primary Reference | Gardner (2022), *Two-stage differences in differences*, arXiv:2207.05943 |
 | R Reference | `did2s` |
-| Status | **In Progress** |
-| Last Review | — |
+| Status | **Complete** |
+| Last Review | 2026-06-23 |
 
-**Documentation in place:**
-- REGISTRY.md section: `## TwoStageDiD` (Stage 1 unit+time FE on untreated, Stage 2 OLS on residualized outcomes, GMM sandwich variance per Newey-McFadden Theorem 6.1)
-- Paper review: `docs/methodology/papers/gardner-2022-review.md` (PR-A — eq./section-numbered review of arXiv:2207.05943; corrected a fabricated Eq. 6 variance deviation, see "Documented alignment" below)
-- Implementation: 76 unit tests in `tests/test_two_stage.py` (matches ImputationDiD point estimates, R `did2s` global `(D'D)^{-1}` variance, always-treated unit exclusion, multiplier bootstrap)
-- Documented alignment: variance = global `(D'D)^{-1}` GMM sandwich (Newey-McFadden Theorem 6.1, Gardner §3.3) — **faithful to both the paper and `did2s`**. Gardner eq. (6) is the *event-study regression spec*, not a variance formula; the earlier "matches `did2s`, not paper Eq. 6" / "Newey-McFadden sandwich vs paper's Eq. 6 deviation" framing was a misattribution, corrected in PR-A across `REGISTRY.md` + the paper review.
+**Verified Components** (`tests/test_methodology_two_stage.py`, paper-section/equation-numbered):
+- **§3, eqs. (4)/(6) — the two-stage procedure:** Stage 1 unit+time FE on the untreated set Ω₀, Stage 2 on the residualized outcome recovers the constant overall ATT (eq. 4) and heterogeneous-by-horizon effects (Step 2′ / eq. 6); perturbing a treated outcome shifts the overall ATT by exactly δ/N_treated (treated obs never feed Stage 1); coincides with `ImputationDiD` to 1e-10; the live `covariates=` (fn. 9 in-both-stages) path recovers the ATT under a treatment-confounded covariate — `TestGardner2022Section3TwoStageProcedure`
+- **§3.3 + Appendix B — joint-GMM Newey-McFadden Thm 6.1 variance:** finite/positive SEs; the first-stage correction `γ̂'c_g` makes the GMM SE strictly exceed the no-first-stage floor; a fixed-seed SE regression-pin locks the global-inverse + no-FSA convention; `cluster=None` clusters at the unit; a rank-deficient Ω₀ warns and falls back to dense lstsq; `pretrends=True` leads have finite SEs — `TestGardner2022Section33GMMVariance`
+- **fn. 19 + Proposition 5 (Borusyak et al. 2024):** always-treated units dropped with a warning (treated-unit count falls exactly); no never-treated ⇒ horizons h ≥ h̄ NaN with n_obs>0 + warning; `balance_e` with no qualifying cohort warns + reference-period-only; zero-obs cohorts NaN — `TestGardner2022Identification`
+- **Library extensions:** multiplier bootstrap on the GMM influence function; `vcov_type ∈ {classical, hc2, hc2_bm}` rejected (no cross-stage hat matrix) — `TestGardner2022LibraryDeviations`
+- **R parity vs `did2s::did2s` (v1.2.1):** overall + event-study ATT and SEs match on a fixed-seed staggered panel (analytical corrected clustered SE, `bootstrap=FALSE`); tests assert ATT `abs=1e-6`, SE `abs=1e-7` — `TestTwoStageDiDParityR` (goldens: `benchmarks/data/did2s_golden.json`, generator: `benchmarks/R/generate_did2s_golden.R`)
 
-**Outstanding for promotion:**
-- Dedicated `tests/test_methodology_two_stage.py` with paper-equation-numbered Verified Components walk-through
-- R parity benchmark fixture against `did2s` (none on file)
-- "Corrections Made" listing + flip Status → Complete (PR-B)
+**Corrections Made** (PR-B, surfaced by the `did2s` R parity):
+- **Exact Stage-1 residuals in the GMM variance.** `_compute_gmm_variance` derived its residuals from the *iterative* alternating-projection FE (`_iterative_fe`, ~1e-7 convergence on unbalanced Ω₀) while computing `gamma_hat` exactly — leaving the analytical SE ~1% off the exact sandwich. The variance now re-solves the Stage-1 FE **exactly** (sparse OLS, reusing the `gamma_hat` factorization); the reported `overall_att` still uses the iterative FE (twin-equivalence preserved at 1e-10). Unidentified-FE obs (rank-deficient / Prop-5) fall back to the iterative residual.
+- **Intercept added to `_build_fe_design`.** The prior `[unit_1.., time_1..]` layout (drop first unit + first time, **no intercept**) did not span the constant / grand mean; the exact residual is first-order sensitive to it. Added an intercept column → standard full-rank two-way FE (matches fixest / `did2s`). Same-class fix as ImputationDiD's PR-B FE-design correction. SE now matches `did2s` to ~1e-9.
+
+**Deviations from the reference / library extensions** (see `REGISTRY.md` `## TwoStageDiD`):
+- **Deviation from R:** the `did2s` analytical GMM sandwich uses **no finite-sample multiplier** (meat `= S'S`); the rendered `CR1` label carries no Stata `(n-1)/(n-p)` or `G/(G-1)` factor (matches `did2s`; same FSA-free convention as ImputationDiD's Theorem-3 variance).
+- Multiplier bootstrap on the GMM influence function (library extension; Gardner prescribes analytical GMM SEs only; `did2s` defaults `bootstrap=FALSE`).
+- ⚠️ Paper-permitted but **not exposed**: the Eq. (5) P̄-period-average estimand and the fn. 8 full-sample first-stage variant (tracked in `TODO.md`).
+- `vcov_type` narrowed to the GMM sandwich (`{classical, hc2, hc2_bm}` rejected); `vcov_type="conley"` deferred (TODO).
+
+**R Comparison Results** (`did2s` v1.2.1, fixed-seed panel, `benchmarks/data/did2s_golden.json`):
+
+| Quantity | Python | R | |Δ| |
+|----------|--------|---|-----|
+| Overall ATT | 2.04566790 | 2.04566803 | 1.3e-7 |
+| Overall SE | 0.02813225 | 0.02813225 | 1.0e-9 |
+| Event-study ATT (max) | — | — | 1.6e-7 |
+| Event-study SE (max) | — | — | 3.7e-10 |
+
+(Point-estimate Δ is iterative-FE convergence level; SE Δ is machine precision after the exact Stage-1 re-solve. The parity tests assert ATT `abs=1e-6`, SE `abs=1e-7` for cross-platform robustness.)
 
 ---
 
@@ -1447,7 +1464,6 @@ Promotion priority for the **In Progress** entries, ordered by what's blocked on
 **Substantive-review-blocked (each still missing one or more of: a methodology test file, R parity, or a paper review):**
 
 1. **PlaceboTests** — decide first whether to keep standalone or absorb into per-estimator diagnostic sections; methodologically lightweight either way.
-2. **TwoStageDiD** — the remaining half of the imputation pair (ImputationDiD is now Complete, validated against `didimputation`). Gardner (2022) paper review **landed** (`docs/methodology/papers/gardner-2022-review.md`, PR-A); still needs `tests/test_methodology_two_stage.py` and an R parity fixture against `did2s` to flip to Complete (PR-B).
 
 **Consolidation-pass-blocked (already has paper review or methodology file or R parity; mostly Verified Components walk-through):**
 
diff --git a/TODO.md b/TODO.md
index 045f6fe1..aaa35ee4 100644
--- a/TODO.md
+++ b/TODO.md
@@ -153,6 +153,7 @@ Deferred items from PR reviews that were not addressed before merge.
 | `SpilloverDiD` data-driven `d_bar` selection (Butts 2021b / Butts 2023 JUE Insight cross-validation). | `spillover.py::SpilloverDiD` | follow-up | Low |
 | `SpilloverDiD` sparse cKDTree path for the staggered nearest-treated-distance helper (mirrors the static helper's sparse branch). Currently `_compute_nearest_treated_distance_staggered` always builds dense `(n_units, n_treated_by_onset)` pairwise distance matrices per cohort; on large staggered panels with many cohorts this is avoidable memory/runtime. Add a sparse k-d-tree branch analogous to `_compute_nearest_treated_distance_sparse`, gated on `n > _CONLEY_SPARSE_N_THRESHOLD`. | `spillover.py::_compute_nearest_treated_distance_staggered` | follow-up (Wave B) | Low |
 | `SpilloverDiDResults` in `DiagnosticReport` dispatch tables. Wave C event-study emits a TwoStageDiD-compatible `event_study_effects: Dict[int, Dict]` alias that `plot_event_study` consumes via the new `reference_period` attribute fallback in `_extract_plot_data`, but `SpilloverDiDResults` is NOT registered in `DiagnosticReport`'s `_APPLICABILITY` / `_PT_METHOD` tables — so `DiagnosticReport(spillover_result)` doesn't currently route to event-study diagnostics. Registering requires (a) deciding which diagnostics apply (parallel trends, pre-trends power, heterogeneity, design-effect) AND (b) adding an end-to-end test. | `diff_diff/diagnostic_report.py::_APPLICABILITY`, `_PT_METHOD` | follow-up (Wave C) | Low |
+| `TwoStageDiD` paper-permitted estimand variants not exposed (Gardner 2022): the **Eq. (5) P̄-period-average** estimand (duration-restricted Stage-2 sample) and the **fn. 8 full-sample first-stage** variant (treatment-status×period interactions in Stage 1). Both are valid modifications described in the paper but have no public parameter (`get_params()` exposes neither; Stage 1 is always untreated-only). Documented as ⚠️ in `docs/methodology/papers/gardner-2022-review.md`; surface as `estimand=`/`first_stage=` options if a use case arises. | `diff_diff/two_stage.py` | two-stage-validation follow-up | Low |
 
 #### Performance
 
@@ -173,7 +174,6 @@ Deferred items from PR reviews that were not addressed before merge.
 |-------|----------|----|----------|
 | Drift test for tutorial 24 qualitative power claims (monotonic dilution fast→slow; CS-vs-2×2 MDE crossover/near-parity at slow rollout) — pins the prose against estimator-default/simulation drift | `docs/tutorials/24_staggered_vs_collapsed_power.ipynb` | staggered-analysis-2x2 | Low |
 | ImputationDiD covariate-path variance lacks dedicated R `didimputation` parity / hand-calc. The PR-B FE-design correction (keep all unit dummies) affects the covariate projection too, but only the no-covariate staggered panel is R-parity'd (the covariate path shares the same validated projection code and passes the full suite). Add a covariate (time-varying X) R golden asserting overall/event-study SE parity, or a small dense-design hand-calc for the covariate projection. | `tests/test_methodology_imputation.py`, `benchmarks/R/generate_didimputation_golden.R` | imputation-validation follow-up | Low |
-| TwoStageDiD methodology validation PR-B: add `tests/test_methodology_two_stage.py` (eq./section-numbered Verified Components — Stage-1 FE recovery on untreated obs; Stage-2 overall ATT eq. 4 + event-study eq. 6; GMM first-stage-correction behavior; always-treated drop) + `did2s` R parity fixture (`benchmarks/R/generate_did2s_golden.R` + `benchmarks/data/did2s_golden.json` + `did2s_test_panel.csv`); then flip `METHODOLOGY_REVIEW.md` TwoStageDiD row In Progress → Complete. PR-A (paper review `gardner-2022-review.md`) merged separately. | `tests/test_methodology_two_stage.py`, `benchmarks/`, `METHODOLOGY_REVIEW.md` | two-stage-validation PR-B | Medium |
 | Port the CI `<notebook-prose>` extraction into the reviewer-eval harness so `docs/tutorials/*.ipynb` cases (currently guarded out of `verify-corpus`/`run`) can be reviewed with CI-equivalent context | `tools/reviewer-eval/adapters/ci_prompt.py` | local-review | Low |
 | **Premise corrected — no CI impact (verified 2026-06-07).** The "slow CI" motivation does not hold: no CI workflow installs R (no `setup-r` / `r-lib/actions` / `fixest` / `r-base` install anywhere in `.github/workflows/`), so every R-parity test skips in CI behind a per-file availability gate (`fixest_available` in twfe, `_check_r_contdid()` in continuous_did, `require_r` / `r_available` in `conftest.py`, etc.) — consolidating `Rscript` spawns yields zero CI speedup. The originally-cited file already session-caches its R fits: `test_methodology_twfe.py` exposes `r_twfe_results` / `r_twfe_results_with_covariate` as `scope="session"` fixtures, so each R model runs once per session, not once per test. The only residual is a LOCAL-dev micro-optimization for developers who have R installed: `test_methodology_continuous_did.py` (the `_run_r_contdid` helper plus three standalone inline `Rscript` calls) and `test_methodology_callaway.py` (`_run_r_estimation` called inline in three test methods, plus `_get_r_mpdta_and_results` re-run by the MPDTA R-parity tests) re-spawn `library(...)` per call with no session-level result cache. Applying the twfe session-fixture pattern there would speed local R-parity runs only. Low value; retained as a local-dev note. | `tests/test_methodology_continuous_did.py`, `tests/test_methodology_callaway.py` | #139 | Low |
 | CS R helpers hard-code `xformla = ~ 1`; no covariate-adjusted R benchmark for IRLS path | `tests/test_methodology_callaway.py` | #202 | Low |
diff --git a/benchmarks/R/generate_did2s_golden.R b/benchmarks/R/generate_did2s_golden.R
new file mode 100644
index 00000000..ac2267eb
--- /dev/null
+++ b/benchmarks/R/generate_did2s_golden.R
@@ -0,0 +1,138 @@
+#!/usr/bin/env Rscript
+# Golden generator: TwoStageDiD vs R `did2s::did2s` (Gardner 2022).
+#
+# Writes a fixed staggered-adoption panel and the R reference estimates so that
+# tests/test_methodology_two_stage.py::TestTwoStageDiDParityR can pin Python
+# output against R without requiring R at test time.
+#
+# Outputs (checked into the repo):
+#   benchmarks/data/did2s_test_panel.csv   (unit, time, first_treat, y)
+#   benchmarks/data/did2s_golden.json
+#
+# Usage:
+#   Rscript benchmarks/R/generate_did2s_golden.R
+#
+# Notes:
+#   - did2s defaults to analytical corrected clustered SEs (`bootstrap = FALSE`),
+#     i.e. the Gardner two-stage GMM sandwich with the GLOBAL Jacobian inverse and
+#     NO finite-sample multiplier. This is exactly what TwoStageDiD computes (see
+#     docs/methodology/REGISTRY.md "## TwoStageDiD"); we cluster on `unit` to match
+#     the library default (cluster=None -> unit).
+#   - The first stage `~ 0 | unit + time` demeans by unit + time FE (which span the
+#     constant); the library's GMM variance re-solves the same exact two-way FE.
+#   - Event study: did2s relative-time is `rel_year = time - first_treat` for
+#     treated units and `Inf` for never-treated; `i(rel_year, ref = c(-1, Inf))`
+#     drops the -1 reference period and excludes never-treated. We compare only the
+#     post-treatment horizons (r >= 0), which map 1:1 to the library's
+#     event_study_effects keys (h = 0 is the first treated period). The pre-period
+#     leads (r < 0) are ~0 under parallel trends and are not part of the library's
+#     default (pretrends=False) event study.
+#   - Same DGP (seed, cohorts, horizon-heterogeneous effect) as the didimputation
+#     golden, so the point estimates coincide (the two-stage and imputation
+#     estimands are algebraically identical); only the SEs differ (GMM sandwich vs
+#     the imputation Theorem-3 variance).
+
+suppressMessages({
+  library(did2s)
+  library(jsonlite)
+  library(data.table)
+})
+
+set.seed(2024)
+
+# ---- Deterministic staggered-adoption panel (parallel trends) ----
+# Cohorts 3 and 5 plus a never-treated group (first_treat = 0); 8 periods.
+n_per_cohort <- 60L
+n_periods <- 8L
+cohorts <- c(0L, 3L, 5L)          # 0 = never-treated
+tau_h <- function(k) 1.0 + 0.5 * k # heterogeneous-by-horizon effect
+
+rows <- list()
+uid <- 0L
+for (g in cohorts) {
+  for (j in seq_len(n_per_cohort)) {
+    c_i <- rnorm(1)
+    for (t in seq_len(n_periods)) {
+      beta_t <- 0.5 * t
+      u <- 0.2 * rnorm(1)
+      treated <- (g > 0L) && (t >= g)
+      eff <- if (treated) tau_h(t - g) else 0.0
+      y <- c_i + beta_t + eff + u
+      rows[[length(rows) + 1L]] <- list(
+        unit = uid, time = t, first_treat = g, y = y
+      )
+    }
+    uid <- uid + 1L
+  }
+}
+panel <- rbindlist(rows)
+panel[, unit := as.integer(unit)]
+panel[, time := as.integer(time)]
+panel[, first_treat := as.integer(first_treat)]
+
+panel_path <- file.path("benchmarks", "data", "did2s_test_panel.csv")
+dir.create(dirname(panel_path), recursive = TRUE, showWarnings = FALSE)
+fwrite(panel, panel_path)
+message(sprintf("Wrote panel: %s (%d rows)", panel_path, nrow(panel)))
+
+# did2s-specific columns (derived from first_treat; not committed in the CSV).
+panel[, treat := (first_treat > 0L) & (time >= first_treat)]
+panel[, rel_year := ifelse(first_treat > 0L, time - first_treat, Inf)]
+
+# ---- Overall ATT (static) ----
+overall <- did2s(
+  panel,
+  yname = "y",
+  first_stage = ~ 0 | unit + time,
+  second_stage = ~ i(treat, ref = FALSE),
+  treatment = "treat",
+  cluster_var = "unit"
+)
+oct <- summary(overall)$coeftable
+overall_att <- as.numeric(oct[1, 1])
+overall_se <- as.numeric(oct[1, 2])
+message(sprintf("Overall ATT = %.8f (SE %.8f)", overall_att, overall_se))
+
+# ---- Event study (post-treatment horizons r >= 0) ----
+es <- did2s(
+  panel,
+  yname = "y",
+  first_stage = ~ 0 | unit + time,
+  second_stage = ~ i(rel_year, ref = c(-1, Inf)),
+  treatment = "treat",
+  cluster_var = "unit"
+)
+ect <- summary(es)$coeftable
+es_h_all <- as.integer(gsub("rel_year::", "", rownames(ect), fixed = TRUE))
+keep <- !is.na(es_h_all) & es_h_all >= 0
+ord <- order(es_h_all[keep])
+es_h <- es_h_all[keep][ord]
+es_att <- as.numeric(ect[keep, 1])[ord]
+es_se <- as.numeric(ect[keep, 2])[ord]
+for (i in seq_along(es_h)) {
+  message(sprintf("  h=%d: ATT=%.6f SE=%.6f", es_h[i], es_att[i], es_se[i]))
+}
+
+golden <- list(
+  estimator = "did2s::did2s",
+  meta = list(
+    r_version = R.version.string,
+    did2s_version = as.character(packageVersion("did2s")),
+    seed = 2024L,
+    n_units = length(unique(panel$unit)),
+    n_periods = n_periods,
+    cohorts = cohorts,
+    se_type = "Corrected Clustered (unit); bootstrap = FALSE (analytical GMM sandwich)",
+    note = paste(
+      "did2s analytical corrected clustered SE = the Gardner (2022) two-stage GMM",
+      "sandwich with the global Jacobian inverse and no finite-sample multiplier.",
+      "Event study compares post-treatment horizons r >= 0 (i(rel_year, ref=c(-1,Inf)))."
+    )
+  ),
+  overall = list(att = overall_att, se = overall_se),
+  event_study = list(horizons = es_h, att = es_att, se = es_se)
+)
+
+golden_path <- file.path("benchmarks", "data", "did2s_golden.json")
+write_json(golden, golden_path, auto_unbox = TRUE, pretty = TRUE, digits = 12)
+message(sprintf("Wrote golden: %s", golden_path))
diff --git a/benchmarks/data/did2s_golden.json b/benchmarks/data/did2s_golden.json
new file mode 100644
index 00000000..6fe1f00e
--- /dev/null
+++ b/benchmarks/data/did2s_golden.json
@@ -0,0 +1,22 @@
+{
+  "estimator": "did2s::did2s",
+  "meta": {
+    "r_version": "R version 4.5.2 (2025-10-31)",
+    "did2s_version": "1.2.1",
+    "seed": 2024,
+    "n_units": 180,
+    "n_periods": 8,
+    "cohorts": [0, 3, 5],
+    "se_type": "Corrected Clustered (unit); bootstrap = FALSE (analytical GMM sandwich)",
+    "note": "did2s analytical corrected clustered SE = the Gardner (2022) two-stage GMM sandwich with the global Jacobian inverse and no finite-sample multiplier. Event study compares post-treatment horizons r >= 0 (i(rel_year, ref=c(-1,Inf)))."
+  },
+  "overall": {
+    "att": 2.045668026901,
+    "se": 0.0281322476332
+  },
+  "event_study": {
+    "horizons": [0, 1, 2, 3, 4, 5],
+    "att": [0.9929366801146, 1.512964957441, 2.006499440277, 2.499854710124, 2.964817090473, 3.46735160262],
+    "se": [0.0291425705467, 0.02572953717585, 0.03023588469852, 0.02950889195402, 0.04306469246781, 0.04307353806204]
+  }
+}
diff --git a/benchmarks/data/did2s_test_panel.csv b/benchmarks/data/did2s_test_panel.csv
new file mode 100644
index 00000000..795d1408
--- /dev/null
+++ b/benchmarks/data/did2s_test_panel.csv
@@ -0,0 +1,1441 @@
+unit,time,first_treat,y
+0,1,0,1.57571241935766
+0,2,0,1.96037514490715
+0,3,0,2.4393937775562
+0,4,0,3.21358910235831
+0,5,0,3.74044038043802
+0,6,0,4.0888988405769
+0,7,0,4.45656246750959
+0,8,0,4.73699432845663
+1,1,0,-0.955736306978872
+1,2,0,-0.0275677162035229
+1,3,0,0.545086679135234
+1,4,0,0.937253018862152
+1,5,0,0.723791533328185
+1,6,0,1.84902654760279
+1,7,0,2.19266095078478
+1,8,0,2.96949156815333
+2,1,0,-1.5084015064213
+2,2,0,-0.378111913878043
+2,3,0,-0.042966650307
+2,4,0,0.602429979328702
+2,5,0,0.851296260003455
+2,6,0,1.47331283565702
+2,7,0,1.58516225918639
+2,8,0,2.37025281899957
+3,1,0,0.773553421186285
+3,2,0,1.66281308359629
+3,3,0,1.8967391557127
+3,4,0,2.61963089020398
+3,5,0,3.08364170717106
+3,6,0,3.92554769256714
+3,7,0,4.0849021050487
+3,8,0,4.76261698109082
+4,1,0,0.739479564853692
+4,2,0,1.2231776627253
+4,3,0,1.923004687699
+4,4,0,2.16283950742251
+4,5,0,2.83639367219199
+4,6,0,3.43640690465978
+4,7,0,4.23189561073763
+4,8,0,4.09795945007321
+5,1,0,0.696250409154038
+5,2,0,1.26407014144064
+5,3,0,1.75823707259106
+5,4,0,2.51852874038656
+5,5,0,2.51953066817035
+5,6,0,2.97729626930948
+5,7,0,3.50109960885476
+5,8,0,3.92493231036827
+6,1,0,-0.467620203725488
+6,2,0,0.498200631038206
+6,3,0,0.771289674442228
+6,4,0,1.51918269111337
+6,5,0,1.81258393257196
+6,6,0,2.63769134378166
+6,7,0,2.99114141499478
+6,8,0,3.426775882052
+7,1,0,1.00442536585164
+7,2,0,1.14946814984633
+7,3,0,1.27614494187586
+7,4,0,2.31817361569775
+7,5,0,2.7735856504085
+7,6,0,2.75148321958136
+7,7,0,3.9015094287719
+7,8,0,3.93360268177416
+8,1,0,0.810919656606769
+8,2,0,1.33509796825539
+8,3,0,1.73331413864552
+8,4,0,2.58012394795544
+8,5,0,2.93256002869384
+8,6,0,3.08809466037152
+8,7,0,3.7990024564743
+8,8,0,4.047223982925
+9,1,0,1.66465200951401
+9,2,0,1.64085127184322
+9,3,0,2.52162062529756
+9,4,0,3.21387389366594
+9,5,0,3.45530839871767
+9,6,0,3.94082142799207
+9,7,0,4.44167164418959
+9,8,0,5.15518317344816
+10,1,0,-0.831661359911138
+10,2,0,-0.368357827183266
+10,3,0,0.410072156567392
+10,4,0,0.882103241637359
+10,5,0,0.859414101818633
+10,6,0,1.5394209127834
+10,7,0,2.03673455706977
+10,8,0,2.76663279055195
+11,1,0,2.09325413143569
+11,2,0,2.2395483474461
+11,3,0,2.8601549884189
+11,4,0,3.17665713989311
+11,5,0,3.39847189994413
+11,6,0,4.2263937983117
+11,7,0,4.61159990784862
+11,8,0,5.38247837176413
+12,1,0,-0.289257031210422
+12,2,0,0.179620271573438
+12,3,0,1.16325667887395
+12,4,0,1.29809414474463
+12,5,0,1.93251912824397
+12,6,0,2.31329931268197
+12,7,0,2.57222068621979
+12,8,0,3.81925292367005
+13,1,0,0.976960925165865
+13,2,0,1.71134856408002
+13,3,0,1.9768653993617
+13,4,0,2.46420562075595
+13,5,0,2.76918384264762
+13,6,0,3.49046122766092
+13,7,0,3.88763110383467
+13,8,0,4.53568739049527
+14,1,0,-0.791853272056159
+14,2,0,-0.187296784744338
+14,3,0,0.530416765901547
+14,4,0,0.732470150162659
+14,5,0,1.4532088556853
+14,6,0,1.79656184517501
+14,7,0,2.43385127094151
+14,8,0,3.26466655229001
+15,1,0,-0.583155656781233
+15,2,0,0.262614329354861
+15,3,0,0.72222756208318
+15,4,0,1.09701695155011
+15,5,0,1.76734636682187
+15,6,0,1.90769767564863
+15,7,0,2.49770816748529
+15,8,0,3.50566869748441
+16,1,0,1.51029210211226
+16,2,0,2.21108464745424
+16,3,0,2.39172676176666
+16,4,0,3.03846284438367
+16,5,0,3.57410107529903
+16,6,0,3.63656357697657
+16,7,0,4.35531989841439
+16,8,0,5.34275735958087
+17,1,0,1.59099751867213
+17,2,0,1.65413717936619
+17,3,0,2.81028796905675
+17,4,0,3.18328780802577
+17,5,0,3.60193326786874
+17,6,0,4.30288969049577
+17,7,0,5.09222854841844
+17,8,0,5.24392584530123
+18,1,0,0.807934311011389
+18,2,0,0.919434492380615
+18,3,0,1.36895852651825
+18,4,0,2.02293951095349
+18,5,0,2.15047307163489
+18,6,0,3.24011684506403
+18,7,0,3.93474369140224
+18,8,0,3.92183101868943
+19,1,0,1.32801513099924
+19,2,0,1.96742115703838
+19,3,0,2.50553624597495
+19,4,0,3.1571181882546
+19,5,0,3.67433692462913
+19,6,0,3.95909268282874
+19,7,0,4.56895422521186
+19,8,0,4.85022982238357
+20,1,0,1.24640274552686
+20,2,0,1.82949029027603
+20,3,0,2.06251105183055
+20,4,0,2.6204320278319
+20,5,0,3.34593099675153
+20,6,0,3.89479306001932
+20,7,0,4.12342129630388
+20,8,0,4.56147002880037
+21,1,0,0.860756519278362
+21,2,0,1.2694433258087
+21,3,0,2.08956104088926
+21,4,0,2.69159713038917
+21,5,0,3.23602267039261
+21,6,0,3.59518778140755
+21,7,0,4.23131902599625
+21,8,0,4.57617578349933
+22,1,0,2.43391823145599
+22,2,0,3.29698778720204
+22,3,0,3.89162276187279
+22,4,0,4.14100139568931
+22,5,0,4.83762542995638
+22,6,0,5.50105884329399
+22,7,0,5.5945345328711
+22,8,0,6.14616319307488
+23,1,0,0.820304078069636
+23,2,0,1.23046468721595
+23,3,0,1.50600314831064
+23,4,0,2.25606058438898
+23,5,0,2.53143432820006
+23,6,0,2.59572892973837
+23,7,0,3.77272896159646
+23,8,0,4.22403778880971
+24,1,0,1.30320440910979
+24,2,0,1.57940840743969
+24,3,0,2.41463681558043
+24,4,0,2.56829935718091
+24,5,0,3.49780858741517
+24,6,0,3.72559263134312
+24,7,0,4.18400971215264
+24,8,0,4.70396790016707
+25,1,0,0.722394762356275
+25,2,0,1.3311251714644
+25,3,0,1.68814275470761
+25,4,0,2.36251135873304
+25,5,0,3.00874464026825
+25,6,0,3.37448669083085
+25,7,0,3.69537799006947
+25,8,0,4.16475050532772
+26,1,0,0.485466137435106
+26,2,0,0.955298995135348
+26,3,0,1.3648799323756
+26,4,0,1.98362552002136
+26,5,0,2.28467222598593
+26,6,0,3.01150181904857
+26,7,0,3.68634796548661
+26,8,0,4.24100706573287
+27,1,0,1.45213953627161
+27,2,0,2.51251641366008
+27,3,0,2.52124732803043
+27,4,0,3.18575111662751
+27,5,0,3.40979315749624
+27,6,0,4.16480766882068
+27,7,0,4.9800117454749
+27,8,0,5.16866225517743
+28,1,0,1.47041955448512
+28,2,0,2.08037536287145
+28,3,0,2.21113330902964
+28,4,0,3.05554611460444
+28,5,0,3.49374311745772
+28,6,0,4.1364449581819
+28,7,0,4.29578854745349
+28,8,0,4.97821774925704
+29,1,0,0.262356292405827
+29,2,0,0.505228319934758
+29,3,0,0.648309820648464
+29,4,0,1.77814500276858
+29,5,0,2.06551571971985
+29,6,0,2.80221659444927
+29,7,0,3.26443933955507
+29,8,0,3.84146488548468
+30,1,0,1.35892475350633
+30,2,0,1.62643339657715
+30,3,0,2.23187200983097
+30,4,0,3.00915928829472
+30,5,0,3.40093817799691
+30,6,0,3.77078234223752
+30,7,0,4.07720483163503
+30,8,0,4.70776265031241
+31,1,0,0.514284092131842
+31,2,0,0.929242002262637
+31,3,0,1.65404943247237
+31,4,0,1.92599558635916
+31,5,0,2.50336926100635
+31,6,0,2.80292712881225
+31,7,0,3.22548807917408
+31,8,0,3.89407585007867
+32,1,0,2.1037921374156
+32,2,0,1.9591790558313
+32,3,0,2.75788925585417
+32,4,0,3.42871103233711
+32,5,0,3.51374483193496
+32,6,0,4.16476294003874
+32,7,0,4.73817884229738
+32,8,0,5.35965712639246
+33,1,0,-0.983878865913057
+33,2,0,-0.201733160596227
+33,3,0,0.413200978341906
+33,4,0,0.644485139682032
+33,5,0,1.20737835692523
+33,6,0,1.98446074191797
+33,7,0,2.58941144241347
+33,8,0,2.86083222928688
+34,1,0,0.534940413004529
+34,2,0,0.927812496777354
+34,3,0,1.65556229998545
+34,4,0,1.96977450879971
+34,5,0,2.12481216904322
+34,6,0,2.56992884563598
+34,7,0,3.21788407818519
+34,8,0,4.47795248685649
+35,1,0,0.0704538455930257
+35,2,0,0.586045127464323
+35,3,0,1.28693189219002
+35,4,0,1.45015426885344
+35,5,0,1.86472666254589
+35,6,0,2.6183900614437
+35,7,0,3.1560522061993
+35,8,0,3.44252965913117
+36,1,0,3.12452067281464
+36,2,0,3.45229798099662
+36,3,0,4.29435562137185
+36,4,0,4.26660787900754
+36,5,0,5.0842455742871
+36,6,0,5.67391110796837
+36,7,0,6.0321764957069
+36,8,0,6.9248556835705
+37,1,0,0.248010121268287
+37,2,0,0.218757774433319
+37,3,0,1.29582269643126
+37,4,0,1.10673143478588
+37,5,0,2.0234932808739
+37,6,0,2.61035176377423
+37,7,0,2.89106753234132
+37,8,0,3.20359053825622
+38,1,0,0.834482571899491
+38,2,0,1.4723499908137
+38,3,0,2.1300539482473
+38,4,0,2.09732015937341
+38,5,0,2.92321674683984
+38,6,0,3.09833923540185
+38,7,0,4.08675476446452
+38,8,0,4.24565060260494
+39,1,0,1.44107802593
+39,2,0,1.64489916188785
+39,3,0,1.88851144894041
+39,4,0,2.69471610525367
+39,5,0,3.44232085355739
+39,6,0,3.95954657224985
+39,7,0,4.53142848793589
+39,8,0,4.57597793926637
+40,1,0,0.364647298466558
+40,2,0,0.604784173033909
+40,3,0,1.12422873073565
+40,4,0,1.84301494999195
+40,5,0,2.14882705365019
+40,6,0,2.78891907777397
+40,7,0,3.27795453488485
+40,8,0,3.87275888633489
+41,1,0,0.137967529800507
+41,2,0,0.204201956718598
+41,3,0,0.706650016793307
+41,4,0,1.25929805115561
+41,5,0,1.79895672923494
+41,6,0,2.26316227343556
+41,7,0,2.74737465082776
+41,8,0,3.19656774941229
+42,1,0,2.65694491490471
+42,2,0,3.36021547561185
+42,3,0,3.77649375599502
+42,4,0,4.37832197992893
+42,5,0,4.83162233107423
+42,6,0,5.63097688304612
+42,7,0,5.9035177657234
+42,8,0,6.00244172559591
+43,1,0,1.29459339196085
+43,2,0,2.06229387101955
+43,3,0,2.32393491880218
+43,4,0,2.96976054361446
+43,5,0,3.49981008932945
+43,6,0,3.5922969887643
+43,7,0,4.60474195517341
+43,8,0,4.93145452060186
+44,1,0,0.0537387830451247
+44,2,0,0.315467553812196
+44,3,0,1.04907888135422
+44,4,0,1.98578147714612
+44,5,0,2.243375433765
+44,6,0,2.62367532135965
+44,7,0,2.91980931154629
+44,8,0,3.77728561241702
+45,1,0,-1.51094473126128
+45,2,0,-1.08114119663183
+45,3,0,-0.495719551502474
+45,4,0,0.0807982002853336
+45,5,0,0.719393630593725
+45,6,0,0.955141386401007
+45,7,0,1.60386116012261
+45,8,0,2.08632671800602
+46,1,0,1.25483524625893
+46,2,0,1.52349388053221
+46,3,0,2.07864012836276
+46,4,0,2.42760483592431
+46,5,0,3.36163054876707
+46,6,0,3.39605200020633
+46,7,0,4.55456273987458
+46,8,0,4.71915089881398
+47,1,0,0.237988571102958
+47,2,0,0.508867690357799
+47,3,0,1.35934282325916
+47,4,0,1.29053284350802
+47,5,0,2.2285081029067
+47,6,0,2.85054722723214
+47,7,0,3.48715870521065
+47,8,0,3.85639684798515
+48,1,0,1.29755155236581
+48,2,0,2.17913342070303
+48,3,0,2.6784298272036
+48,4,0,2.6548752237907
+48,5,0,3.30206584127549
+48,6,0,3.86254626518904
+48,7,0,4.47816735226851
+48,8,0,4.85765399592985
+49,1,0,2.01909707704745
+49,2,0,2.47308897703961
+49,3,0,3.17407097381774
+49,4,0,3.48203011921528
+49,5,0,3.81709557266252
+49,6,0,4.45747821086635
+49,7,0,5.09981572791837
+49,8,0,5.4802102218025
+50,1,0,-0.448837895689941
+50,2,0,-0.298643802901852
+50,3,0,-0.080419715770935
+50,4,0,1.02593965853912
+50,5,0,1.56524071083903
+50,6,0,1.96652476356899
+50,7,0,2.52012227980579
+50,8,0,2.63075998602788
+51,1,0,1.86181485827551
+51,2,0,2.35816003559163
+51,3,0,2.83862141095108
+51,4,0,3.46711621495583
+51,5,0,3.41682084283998
+51,6,0,4.22243929403655
+51,7,0,4.75847326488305
+51,8,0,5.49054291782366
+52,1,0,2.23352514687859
+52,2,0,3.00549552536479
+52,3,0,3.30909077188148
+52,4,0,3.78770896749406
+52,5,0,4.34257127024374
+52,6,0,4.80766785701701
+52,7,0,5.34334476016836
+52,8,0,5.83860145420842
+53,1,0,-0.344559747867141
+53,2,0,0.395564116331045
+53,3,0,0.825027952693013
+53,4,0,1.14369923051778
+53,5,0,1.62492672907177
+53,6,0,2.3809402166797
+53,7,0,2.58561529765697
+53,8,0,3.33341961530725
+54,1,0,0.314539815534906
+54,2,0,0.669260561259938
+54,3,0,1.46722345622402
+54,4,0,1.76349563660827
+54,5,0,2.24079497205936
+54,6,0,2.84479884210006
+54,7,0,3.22654888876197
+54,8,0,3.59724632826831
+55,1,0,0.525365760401837
+55,2,0,1.3340038530387
+55,3,0,1.34050524765035
+55,4,0,2.25559773033464
+55,5,0,2.79970276099734
+55,6,0,2.99841464625356
+55,7,0,3.80445972387623
+55,8,0,4.14782592813191
+56,1,0,2.56790198143581
+56,2,0,2.81550010461755
+56,3,0,4.08671392398119
+56,4,0,4.21723021006283
+56,5,0,4.51843018970835
+56,6,0,5.24797068517281
+56,7,0,5.82256153956169
+56,8,0,6.04644022917596
+57,1,0,1.27608262164854
+57,2,0,1.72002062777107
+57,3,0,2.25420838813264
+57,4,0,2.81222438525511
+57,5,0,3.34292214381018
+57,6,0,3.46463499145415
+57,7,0,4.22766560217851
+57,8,0,4.79839845669278
+58,1,0,-0.658053842903073
+58,2,0,0.160768557746596
+58,3,0,1.07139341593188
+58,4,0,1.00688918370808
+58,5,0,1.89184774305466
+58,6,0,2.38059852440773
+58,7,0,2.80089686759589
+58,8,0,3.06344299321046
+59,1,0,-0.70984085235411
+59,2,0,-0.0343084470278902
+59,3,0,0.385499554590131
+59,4,0,1.24525769117722
+59,5,0,1.31020609466661
+59,6,0,1.67159139528118
+59,7,0,1.97309584979105
+59,8,0,2.87245086682389
+60,1,3,-0.719268682234314
+60,2,3,-0.0560185689875659
+60,3,3,1.16254141100696
+60,4,3,2.29921974917942
+60,5,3,3.69311261584687
+60,6,3,4.51382655685811
+60,7,3,5.24160398532174
+60,8,3,6.30226657904429
+61,1,3,0.677801658691706
+61,2,3,1.11477775412072
+61,3,3,2.62091150194693
+61,4,3,3.49229117282332
+61,5,3,4.49100206304702
+61,6,3,5.58189534770296
+61,7,3,6.19386990222223
+61,8,3,7.83930335657842
+62,1,3,0.305561330615861
+62,2,3,0.970588062449194
+62,3,3,2.59487901961403
+62,4,3,3.0754979630223
+62,5,3,4.18157239389626
+62,6,3,4.86817117141176
+62,7,3,6.25367435797187
+62,8,3,7.18548577800311
+63,1,3,1.05824399624178
+63,2,3,1.15297460223181
+63,3,3,2.8240327290642
+63,4,3,3.83303468974252
+63,5,3,5.18592435458602
+63,6,3,5.72911534903804
+63,7,3,7.18861430423524
+63,8,3,8.33039959696296
+64,1,3,-0.101927785431208
+64,2,3,0.660565642440033
+64,3,3,1.64434385128881
+64,4,3,2.68789348574499
+64,5,3,4.09861609518632
+64,6,3,4.56080665651255
+64,7,3,5.75303580684061
+64,8,3,7.09034460439771
+65,1,3,0.701189036899391
+65,2,3,0.979286949554214
+65,3,3,3.03584391610724
+65,4,3,3.97128878578281
+65,5,3,4.67602103277406
+65,6,3,5.42694651882043
+65,7,3,6.40152363172409
+65,8,3,7.7702764531453
+66,1,3,0.160452702464162
+66,2,3,0.656226548790744
+66,3,3,2.52096178512908
+66,4,3,3.32607449371447
+66,5,3,4.16196429261208
+66,6,3,4.70692724161365
+66,7,3,6.48121105900816
+66,8,3,7.34421580480872
+67,1,3,0.673442654651696
+67,2,3,0.962843749444522
+67,3,3,2.60775819037386
+67,4,3,3.21087811843561
+67,5,3,4.61287762004842
+67,6,3,5.53125653664404
+67,7,3,6.38192420044873
+67,8,3,7.86926085559182
+68,1,3,0.166620263373829
+68,2,3,0.443630391490223
+68,3,3,1.6747810326122
+68,4,3,3.02439782857839
+68,5,3,4.30195869883444
+68,6,3,4.857825046864
+68,7,3,5.87638580951173
+68,8,3,7.01097500867526
+69,1,3,-0.641779152999048
+69,2,3,-0.221423278943882
+69,3,3,1.38658267743202
+69,4,3,2.60253269489247
+69,5,3,3.37017948890763
+69,6,3,4.19563369315284
+69,7,3,5.39454581410649
+69,8,3,6.39442601137506
+70,1,3,-1.07865231004769
+70,2,3,-0.715077800569542
+70,3,3,1.05579627565737
+70,4,3,1.96080227310335
+70,5,3,3.36852477828328
+70,6,3,3.90080254850334
+70,7,3,5.26167711925935
+70,8,3,6.15880759001324
+71,1,3,1.75033664102872
+71,2,3,2.30642048048951
+71,3,3,4.04696319603754
+71,4,3,5.05916472988662
+71,5,3,5.94546718107763
+71,6,3,6.6467891244918
+71,7,3,7.71910730541043
+71,8,3,9.30854464461094
+72,1,3,-0.400589123970168
+72,2,3,0.704652518789466
+72,3,3,1.84079466718125
+72,4,3,2.92576769505844
+72,5,3,3.95249416878941
+72,6,3,5.11855961618223
+72,7,3,6.22523006134833
+72,8,3,6.87750866417755
+73,1,3,0.147520541623632
+73,2,3,0.690283853463692
+73,3,3,2.30283363578552
+73,4,3,3.07028262117547
+73,5,3,4.21465755496875
+73,6,3,5.35526948570922
+73,7,3,5.84832192155986
+73,8,3,7.28878335074962
+74,1,3,-0.0478317593398545
+74,2,3,0.268926163269274
+74,3,3,1.70095356686444
+74,4,3,2.88125406809142
+74,5,3,3.75440375750523
+74,6,3,4.72843651831305
+74,7,3,5.62862501563621
+74,8,3,6.69591708668679
+75,1,3,-0.405809940825969
+75,2,3,0.0900305887133675
+75,3,3,1.7184987102796
+75,4,3,2.32097672798748
+75,5,3,3.57509620039659
+75,6,3,4.70212505734094
+75,7,3,5.66087902239169
+75,8,3,7.18631191136391
+76,1,3,1.02167361969083
+76,2,3,1.69794074051215
+76,3,3,3.07623542965749
+76,4,3,3.95508721058957
+76,5,3,4.83444155561634
+76,6,3,6.00071282815628
+76,7,3,7.01811712970264
+76,8,3,8.23370032057519
+77,1,3,1.64576727904779
+77,2,3,1.82928182967484
+77,3,3,3.41017195105545
+77,4,3,4.82781016841801
+77,5,3,5.84019216911828
+77,6,3,6.17451297287111
+77,7,3,7.29827289804374
+77,8,3,8.72409327068498
+78,1,3,0.133857463187289
+78,2,3,0.725497407732272
+78,3,3,2.12147131221335
+78,4,3,3.32458703624904
+78,5,3,4.21752200613813
+78,6,3,5.43826635440585
+78,7,3,6.44931666372289
+78,8,3,7.22951162055743
+79,1,3,0.0664085065342783
+79,2,3,0.612877586450337
+79,3,3,2.36715889678118
+79,4,3,2.96946782437747
+79,5,3,4.00410947355307
+79,6,3,5.40236491055802
+79,7,3,6.14988628888044
+79,8,3,7.0871738675892
+80,1,3,1.13367864223052
+80,2,3,1.7411622309359
+80,3,3,3.35438805520481
+80,4,3,4.17766379974428
+80,5,3,5.55375054425833
+80,6,3,6.52162478755372
+80,7,3,7.57538743044453
+80,8,3,8.53532311496596
+81,1,3,0.164198356324343
+81,2,3,1.08220392452701
+81,3,3,2.10066530441217
+81,4,3,3.39049462694941
+81,5,3,4.0423463256487
+81,6,3,5.29608644567218
+81,7,3,6.5846947763314
+81,8,3,7.45171907268693
+82,1,3,1.95292002267312
+82,2,3,2.66804235065524
+82,3,3,3.93328273263077
+82,4,3,5.36347054378852
+82,5,3,6.06683860839213
+82,6,3,6.94301654777259
+82,7,3,8.21118877718264
+82,8,3,9.15614741092001
+83,1,3,0.371526712127935
+83,2,3,0.581254274680354
+83,3,3,2.1198225582618
+83,4,3,3.29057531333062
+83,5,3,4.54895821309303
+83,6,3,5.15700625028827
+83,7,3,6.38443767309559
+83,8,3,7.08306530026089
+84,1,3,0.434283865463328
+84,2,3,0.662768371723026
+84,3,3,2.87459496588888
+84,4,3,3.20320613575708
+84,5,3,4.37130674088997
+84,6,3,5.03720949577778
+84,7,3,6.3607043867553
+84,8,3,7.01369330906244
+85,1,3,-0.367309155119938
+85,2,3,0.259182324825179
+85,3,3,1.12220553893189
+85,4,3,2.65130106442621
+85,5,3,3.60964708718875
+85,6,3,4.85380686871999
+85,7,3,5.62351296773631
+85,8,3,6.70124838920003
+86,1,3,0.319974969587851
+86,2,3,0.406142972357535
+86,3,3,1.926672712444
+86,4,3,3.24662671689068
+86,5,3,4.28581787269771
+86,6,3,5.17423004530816
+86,7,3,6.4216854231382
+86,8,3,7.11098904575663
+87,1,3,-0.0104903757766827
+87,2,3,0.248276263315795
+87,3,3,1.60551697625337
+87,4,3,2.93540731541587
+87,5,3,4.0463209954738
+87,6,3,4.87959240162977
+87,7,3,6.05741196274632
+87,8,3,6.78595609626293
+88,1,3,2.1566158885518
+88,2,3,2.37700426614069
+88,3,3,4.67890547088398
+88,4,3,5.26815327424333
+88,5,3,6.29140120414791
+88,6,3,6.89251845648695
+88,7,3,8.29889197576343
+88,8,3,9.02620478742367
+89,1,3,0.522823311622717
+89,2,3,0.720894021704307
+89,3,3,2.0481538775907
+89,4,3,3.53706437353728
+89,5,3,4.08516480881842
+89,6,3,5.19435218107831
+89,7,3,6.35222386583984
+89,8,3,7.58938749114136
+90,1,3,0.110478214499304
+90,2,3,0.361221016034652
+90,3,3,1.69191466379943
+90,4,3,2.93434401191667
+90,5,3,3.88750436417871
+90,6,3,5.01369680784638
+90,7,3,5.70314936370187
+90,8,3,6.77412158237975
+91,1,3,0.158076458872298
+91,2,3,0.748418876833585
+91,3,3,2.33519900265607
+91,4,3,3.47993652356401
+91,5,3,4.22316386563191
+91,6,3,5.59531669381371
+91,7,3,6.29475494423444
+91,8,3,7.31018232042381
+92,1,3,-0.845667721672671
+92,2,3,-0.0952237476902149
+92,3,3,1.09449929433509
+92,4,3,2.32697635305069
+92,5,3,3.53162160382455
+92,6,3,4.18412212993421
+92,7,3,5.43072655030752
+92,8,3,6.2528718321991
+93,1,3,-0.554396371599341
+93,2,3,-0.487548471136003
+93,3,3,0.872855049255904
+93,4,3,2.62512632052788
+93,5,3,3.63237392425111
+93,6,3,3.8332724417017
+93,7,3,5.38698788925701
+93,8,3,6.51892807410421
+94,1,3,1.74353621268684
+94,2,3,2.62240125056578
+94,3,3,3.4658937291633
+94,4,3,4.71292088569413
+94,5,3,5.65289936064052
+94,6,3,6.97774155638962
+94,7,3,7.5748409825114
+94,8,3,8.32376512209955
+95,1,3,1.24501550075218
+95,2,3,1.27936709974271
+95,3,3,2.86647376653648
+95,4,3,3.84154660633742
+95,5,3,4.71191176207907
+95,6,3,5.65740174075242
+95,7,3,6.81485607094101
+95,8,3,7.98414735973719
+96,1,3,0.835625442247262
+96,2,3,1.63524961836139
+96,3,3,2.88242212021634
+96,4,3,3.75660238927897
+96,5,3,4.91209469035123
+96,6,3,6.12390731013328
+96,7,3,7.04894242300733
+96,8,3,7.86466760806126
+97,1,3,-0.0227065616852005
+97,2,3,0.570270933257905
+97,3,3,1.53086916477496
+97,4,3,2.70270441960765
+97,5,3,3.65556887712201
+97,6,3,4.52944799962726
+97,7,3,5.81665674179551
+97,8,3,6.26737176243005
+98,1,3,0.1608080379801
+98,2,3,0.746437314670479
+98,3,3,2.23111744027251
+98,4,3,3.2078673406281
+98,5,3,4.40130516506188
+98,6,3,5.24395988270281
+98,7,3,6.30157721343787
+98,8,3,7.03498514715133
+99,1,3,1.8412747626686
+99,2,3,2.24192297245743
+99,3,3,4.14772668698801
+99,4,3,4.57029118484342
+99,5,3,5.97678772699363
+99,6,3,7.40705263000443
+99,7,3,8.20310972268435
+99,8,3,8.95583458420776
+100,1,3,0.853603661254639
+100,2,3,1.15557667523494
+100,3,3,3.01447867943591
+100,4,3,4.17018419684842
+100,5,3,4.97210964256587
+100,6,3,5.83965101740009
+100,7,3,6.92479587369302
+100,8,3,7.97958531088616
+101,1,3,2.39580725217959
+101,2,3,2.86449727737082
+101,3,3,4.17312547480334
+101,4,3,5.53806752827575
+101,5,3,6.39501064043096
+101,6,3,7.26811215000094
+101,7,3,8.53649514251206
+101,8,3,9.45485039379188
+102,1,3,0.595676431885384
+102,2,3,1.05348373913475
+102,3,3,2.63582181740774
+102,4,3,3.7167673944854
+102,5,3,4.52909719292877
+102,6,3,5.41266353430887
+102,7,3,6.43924226913368
+102,8,3,7.18526772340207
+103,1,3,1.5107003697783
+103,2,3,2.12260957942516
+103,3,3,3.18644365578075
+103,4,3,4.53918387496874
+103,5,3,5.06341455602335
+103,6,3,6.49283917995712
+103,7,3,7.75323395581945
+103,8,3,8.35667316156763
+104,1,3,1.88866758741314
+104,2,3,2.18231899894117
+104,3,3,3.71554884855156
+104,4,3,4.68146750843871
+104,5,3,6.10361571530113
+104,6,3,6.67083431262761
+104,7,3,7.7800345307164
+104,8,3,8.92777893266183
+105,1,3,1.73762508172132
+105,2,3,2.32733246243909
+105,3,3,3.55460759166834
+105,4,3,4.86889777317026
+105,5,3,6.08193842927183
+105,6,3,7.04148805029592
+105,7,3,7.35974106864338
+105,8,3,8.86588126079279
+106,1,3,-0.537630215045871
+106,2,3,0.353196144159918
+106,3,3,1.74238421667739
+106,4,3,2.7909111088959
+106,5,3,3.97419562263608
+106,6,3,4.98873067434378
+106,7,3,5.61659205454897
+106,8,3,6.92400449203583
+107,1,3,-1.51336148966568
+107,2,3,-0.987771702119031
+107,3,3,0.510206440394591
+107,4,3,1.65263967956168
+107,5,3,2.2231983838614
+107,6,3,3.25914843042085
+107,7,3,4.77817247624947
+107,8,3,5.46718795872109
+108,1,3,-0.621733240381946
+108,2,3,0.0353380174099048
+108,3,3,1.476436324534
+108,4,3,2.4821327383924
+108,5,3,3.66505516491496
+108,6,3,4.26493239579394
+108,7,3,5.03385050201781
+108,8,3,6.15505561370686
+109,1,3,0.65698658036601
+109,2,3,0.879559369971633
+109,3,3,2.45640402745839
+109,4,3,3.78941510679771
+109,5,3,4.73263239779589
+109,6,3,5.33156639906988
+109,7,3,6.54950383050795
+109,8,3,7.66046552316943
+110,1,3,-0.278290439134022
+110,2,3,-0.00531568792845563
+110,3,3,1.32161966941723
+110,4,3,2.67616444547022
+110,5,3,3.77072238532519
+110,6,3,4.16247631364279
+110,7,3,5.47110653208128
+110,8,3,6.27650112074593
+111,1,3,1.26035159610515
+111,2,3,1.67081325682447
+111,3,3,3.09018449245164
+111,4,3,4.19629820774435
+111,5,3,5.23300416518958
+111,6,3,6.12336713464138
+111,7,3,7.09543733662832
+111,8,3,7.9910425192222
+112,1,3,0.768782327309874
+112,2,3,1.49434073539369
+112,3,3,3.0220678108492
+112,4,3,3.67076347692299
+112,5,3,4.84071776365439
+112,6,3,5.42711417981796
+112,7,3,6.74530298073304
+112,8,3,8.05337576385035
+113,1,3,-0.512842567222705
+113,2,3,0.333803710315301
+113,3,3,1.77579958118868
+113,4,3,2.70429446040193
+113,5,3,3.44439599189216
+113,6,3,4.74446930915482
+113,7,3,5.78131060905661
+113,8,3,6.75110414102881
+114,1,3,-0.185357563549793
+114,2,3,0.304278260357402
+114,3,3,1.40795238760697
+114,4,3,2.3141226921558
+114,5,3,3.26458761204062
+114,6,3,4.53496954953977
+114,7,3,5.24542510625861
+114,8,3,6.2486231935731
+115,1,3,0.244762750632874
+115,2,3,0.182902600080998
+115,3,3,2.15982418169374
+115,4,3,2.93209076379569
+115,5,3,3.56306414845394
+115,6,3,5.15462368228077
+115,7,3,5.96139690314987
+115,8,3,6.87629331865792
+116,1,3,0.899813559307228
+116,2,3,1.29836655405475
+116,3,3,2.43105086510445
+116,4,3,3.3601921105621
+116,5,3,4.60638982728935
+116,6,3,5.80556147441881
+116,7,3,6.73591475723964
+116,8,3,7.67421963338842
+117,1,3,0.670377457370441
+117,2,3,0.655388579586071
+117,3,3,2.32780928316695
+117,4,3,3.08664711993556
+117,5,3,4.36954660417253
+117,6,3,5.42122299400728
+117,7,3,6.26240238609592
+117,8,3,7.20523579801548
+118,1,3,-0.821146989568458
+118,2,3,-0.396815103673733
+118,3,3,1.65289439528466
+118,4,3,2.4546710056127
+118,5,3,3.56108791933014
+118,6,3,4.48732065309802
+118,7,3,5.81742309222639
+118,8,3,6.16741022457937
+119,1,3,0.2512503215211
+119,2,3,0.609118567341081
+119,3,3,2.2904153273313
+119,4,3,3.16269752131974
+119,5,3,3.98072316436875
+119,6,3,5.02143044194653
+119,7,3,6.16454816370158
+119,8,3,6.85345983286224
+120,1,5,2.07355769991514
+120,2,5,2.70176141101685
+120,3,5,3.07532887658374
+120,4,5,3.54183584355425
+120,5,5,5.35574267530824
+120,6,5,5.94031247762939
+120,7,5,6.998965831818
+120,8,5,8.25391424022031
+121,1,5,0.545705645898663
+121,2,5,1.59458162025258
+121,3,5,1.77632380720174
+121,4,5,2.39303998153606
+121,5,5,3.69611036335378
+121,6,5,4.78728498785316
+121,7,5,5.77511141255148
+121,8,5,6.62925759767466
+122,1,5,0.348648056262885
+122,2,5,1.32497888174757
+122,3,5,1.7792456248239
+122,4,5,2.06883846870991
+122,5,5,3.55762813412755
+122,6,5,4.61186449320064
+122,7,5,5.94469339723567
+122,8,5,7.22402603967549
+123,1,5,0.11473014330258
+123,2,5,0.511124556261185
+123,3,5,1.14695019214237
+123,4,5,1.619003464941
+123,5,5,2.98883880634006
+123,6,5,4.318828002304
+123,7,5,5.04118314730071
+123,8,5,5.89015258629618
+124,1,5,0.353026663542674
+124,2,5,0.463146009094651
+124,3,5,1.03337658770784
+124,4,5,1.59307067258282
+124,5,5,3.02330619421151
+124,6,5,4.03816633979783
+124,7,5,5.0128772757355
+124,8,5,6.354336326349
+125,1,5,1.30759376251991
+125,2,5,1.43063736991179
+125,3,5,2.40942319627773
+125,4,5,2.52882564103711
+125,5,5,4.17879800486291
+125,6,5,5.07127883112598
+125,7,5,6.16587286371303
+125,8,5,7.33375556255224
+126,1,5,-0.0531641770028999
+126,2,5,0.164519460371381
+126,3,5,0.55992507886714
+126,4,5,1.30926892866658
+126,5,5,2.66056927655847
+126,6,5,3.59413029414135
+126,7,5,4.45258367453011
+126,8,5,5.75588726052884
+127,1,5,2.14707661024169
+127,2,5,2.7033355439571
+127,3,5,3.7305747086312
+127,4,5,4.03212587485596
+127,5,5,5.17105936912862
+127,6,5,6.50471451427771
+127,7,5,7.38290008652188
+127,8,5,8.61332661567901
+128,1,5,0.347949394487828
+128,2,5,0.515133037384221
+128,3,5,0.946661617926291
+128,4,5,1.5863912063442
+128,5,5,3.07486872976229
+128,6,5,4.00854958247365
+128,7,5,4.95533265684274
+128,8,5,6.13097781768475
+129,1,5,1.15350266866743
+129,2,5,1.67385258352653
+129,3,5,1.90531572360025
+129,4,5,2.64547351304772
+129,5,5,3.65116556797884
+129,6,5,5.12513430486377
+129,7,5,6.24326209088698
+129,8,5,7.39585785944324
+130,1,5,2.00368331013529
+130,2,5,2.31096321570056
+130,3,5,3.14897486683582
+130,4,5,3.60000356149414
+130,5,5,5.02521231648775
+130,6,5,5.89867609327909
+130,7,5,7.12214412591731
+130,8,5,8.08117262469428
+131,1,5,-0.160822734391134
+131,2,5,0.722303945240265
+131,3,5,0.697364647519555
+131,4,5,1.1385294912458
+131,5,5,2.83950274529066
+131,6,5,3.97833391764805
+131,7,5,4.5028962883293
+131,8,5,6.02560936653114
+132,1,5,0.579307594077414
+132,2,5,1.28304853736308
+132,3,5,1.5125162008707
+132,4,5,2.31460690732269
+132,5,5,3.7141364128688
+132,6,5,4.8465635471783
+132,7,5,5.57115008644628
+132,8,5,6.55282272317461
+133,1,5,-0.852547031973224
+133,2,5,-0.162648504898197
+133,3,5,0.125175541718792
+133,4,5,1.00399240977452
+133,5,5,2.02288849499148
+133,6,5,3.55731287964353
+133,7,5,4.15789446972334
+133,8,5,5.26073588877272
+134,1,5,-0.490900026742535
+134,2,5,0.118353023295689
+134,3,5,0.590971956082911
+134,4,5,1.08995593062194
+134,5,5,2.62672336169321
+134,6,5,3.38336882640513
+134,7,5,4.47244614261959
+134,8,5,5.99903342583879
+135,1,5,1.12327735834762
+135,2,5,1.37526813862002
+135,3,5,1.67058112798098
+135,4,5,2.23269845844678
+135,5,5,4.38053720522963
+135,6,5,5.0119754398454
+135,7,5,5.75715487827219
+135,8,5,6.87080838229093
+136,1,5,0.641149283406735
+136,2,5,0.833649121738602
+136,3,5,1.34980559006897
+136,4,5,1.76155690834355
+136,5,5,3.35038151623076
+136,6,5,4.41802895727517
+136,7,5,5.31236990270441
+136,8,5,6.26817268927078
+137,1,5,0.696355760662733
+137,2,5,1.00600830150395
+137,3,5,1.60557592949137
+137,4,5,2.03956174255656
+137,5,5,3.52003551528159
+137,6,5,4.37233094089374
+137,7,5,6.02506245389463
+137,8,5,6.76338217231818
+138,1,5,1.48044969692376
+138,2,5,1.69156360234213
+138,3,5,2.3230020268256
+138,4,5,2.60046145396267
+138,5,5,3.99689834307206
+138,6,5,5.34886941673575
+138,7,5,6.40194897896813
+138,8,5,7.43356794691465
+139,1,5,1.12132230347744
+139,2,5,2.04919566667762
+139,3,5,2.37066285188843
+139,4,5,3.04517876839378
+139,5,5,4.80960943033978
+139,6,5,5.52162367612219
+139,7,5,6.60265929863807
+139,8,5,7.50480641176629
+140,1,5,-0.796095474230791
+140,2,5,-0.452579279400348
+140,3,5,0.00947453753975691
+140,4,5,0.833452724844031
+140,5,5,2.57661594087727
+140,6,5,3.25171966543205
+140,7,5,4.21904639603497
+140,8,5,5.27228467034679
+141,1,5,1.44585179068538
+141,2,5,2.08873295069904
+141,3,5,2.9477320311015
+141,4,5,3.16217368658878
+141,5,5,4.47270459846095
+141,6,5,5.73960305079448
+141,7,5,6.40373672551532
+141,8,5,7.55728689115746
+142,1,5,-0.419083902232799
+142,2,5,0.234429057049568
+142,3,5,0.545730486376036
+142,4,5,1.5344252444695
+142,5,5,2.65952327092003
+142,6,5,4.24797783021148
+142,7,5,5.06698135435331
+142,8,5,6.20296172042246
+143,1,5,-1.5759748924097
+143,2,5,-0.86876312251632
+143,3,5,-0.490970695678289
+143,4,5,-0.490913878274664
+143,5,5,1.63376523536751
+143,6,5,2.29734560926664
+143,7,5,3.58238286054971
+143,8,5,4.91327085824494
+144,1,5,0.811978814671721
+144,2,5,1.65001069738936
+144,3,5,1.92530050386947
+144,4,5,2.73902511393377
+144,5,5,4.25416472998866
+144,6,5,4.98233651146248
+144,7,5,6.50313076575639
+144,8,5,7.13923362603824
+145,1,5,1.42018832659402
+145,2,5,2.12203019512399
+145,3,5,2.47549419271258
+145,4,5,2.79695982344628
+145,5,5,4.31985710330443
+145,6,5,5.20566778145388
+145,7,5,6.50710807808323
+145,8,5,7.25253764628188
+146,1,5,0.467187816876626
+146,2,5,0.776203327379875
+146,3,5,1.58690282822873
+146,4,5,1.9252279760618
+146,5,5,3.17601713465207
+146,6,5,4.50551046887625
+146,7,5,5.58620593388508
+146,8,5,6.26271631236738
+147,1,5,0.179163589204672
+147,2,5,0.793213070348187
+147,3,5,1.62995570591685
+147,4,5,1.70516418639264
+147,5,5,3.48446303566426
+147,6,5,4.66418521855636
+147,7,5,5.41086269876853
+147,8,5,6.39593748221464
+148,1,5,-0.329721641230399
+148,2,5,-0.104604986399989
+148,3,5,0.557117626678429
+148,4,5,1.1283800254646
+148,5,5,2.49850643335396
+148,6,5,3.49652631627393
+148,7,5,4.83117268302154
+148,8,5,5.8778316560307
+149,1,5,2.83546914301537
+149,2,5,3.55942504509601
+149,3,5,4.0343478480526
+149,4,5,4.29036331941152
+149,5,5,5.84267818942925
+149,6,5,6.95988152118232
+149,7,5,7.84371289141104
+149,8,5,8.99504393820711
+150,1,5,0.353558327909409
+150,2,5,0.0971312003621931
+150,3,5,0.999285213587224
+150,4,5,1.32465699506173
+150,5,5,3.04946029363458
+150,6,5,4.2541166378007
+150,7,5,5.09876372344629
+150,8,5,5.71906199696452
+151,1,5,0.525750886451906
+151,2,5,0.893396180499992
+151,3,5,2.07896509350654
+151,4,5,2.25867109481025
+151,5,5,3.72877107308203
+151,6,5,4.56557003998835
+151,7,5,5.74034210162108
+151,8,5,6.66484800119341
+152,1,5,-0.959359779342574
+152,2,5,-0.315249756011789
+152,3,5,0.0501839424136046
+152,4,5,1.04222089222118
+152,5,5,1.99002316333137
+152,6,5,3.05414518371304
+152,7,5,4.13750681004899
+152,8,5,5.17931328304503
+153,1,5,-0.475784459340995
+153,2,5,0.038042614793213
+153,3,5,0.384272767841674
+153,4,5,1.22364423384171
+153,5,5,2.44543176058119
+153,6,5,3.41636201138061
+153,7,5,4.35074939454188
+153,8,5,5.49402389634794
+154,1,5,-0.479392940089231
+154,2,5,0.105242923746075
+154,3,5,0.665810253622011
+154,4,5,1.19760855819624
+154,5,5,2.83587415784842
+154,6,5,3.90803611453573
+154,7,5,4.64589043122806
+154,8,5,5.73795478250453
+155,1,5,0.608317423060096
+155,2,5,1.40380927513764
+155,3,5,1.75866339340061
+155,4,5,2.62202958412669
+155,5,5,3.77419704141051
+155,6,5,5.03265720708436
+155,7,5,5.7044836778403
+155,8,5,6.33816195552479
+156,1,5,1.86431961335335
+156,2,5,1.82330603084261
+156,3,5,2.2285891986006
+156,4,5,2.95204440516718
+156,5,5,4.38377615569592
+156,6,5,6.03714325749645
+156,7,5,6.57073648844073
+156,8,5,7.47937160050042
+157,1,5,1.99840370192053
+157,2,5,2.59121177630995
+157,3,5,3.15671154350392
+157,4,5,3.42920370304041
+157,5,5,5.0599439931328
+157,6,5,6.143723692161
+157,7,5,7.1165003634696
+157,8,5,8.2525571950614
+158,1,5,1.03319511937132
+158,2,5,1.32039470544842
+158,3,5,2.11133018212242
+158,4,5,2.66975571142703
+158,5,5,3.96141968710621
+158,6,5,4.75942104932975
+158,7,5,5.95429631606491
+158,8,5,6.92766633257145
+159,1,5,1.03890300454548
+159,2,5,1.222165245985
+159,3,5,2.41987659803987
+159,4,5,2.23505378240698
+159,5,5,4.01387523949909
+159,6,5,5.27211433799155
+159,7,5,5.69558088483459
+159,8,5,7.06135032337447
+160,1,5,1.26288899611135
+160,2,5,1.54092617683566
+160,3,5,2.17184761709794
+160,4,5,3.01475506628533
+160,5,5,4.2263374271918
+160,6,5,5.25666266948215
+160,7,5,6.3595385670447
+160,8,5,7.50241057062099
+161,1,5,0.505126103765275
+161,2,5,0.562896294462252
+161,3,5,1.31207599745235
+161,4,5,2.20151827539788
+161,5,5,3.29283903880832
+161,6,5,4.34411537759276
+161,7,5,5.28441267859029
+161,8,5,6.70739180559209
+162,1,5,-0.0758061262284153
+162,2,5,0.407051283873834
+162,3,5,0.893792739036451
+162,4,5,1.56889992048944
+162,5,5,3.46568416712827
+162,6,5,4.17380092831167
+162,7,5,4.92813412215899
+162,8,5,6.02311921589609
+163,1,5,-0.104307312353201
+163,2,5,0.568610824001807
+163,3,5,0.725597092563344
+163,4,5,0.99041532712622
+163,5,5,2.82907626157414
+163,6,5,3.81925025136067
+163,7,5,4.83817321685703
+163,8,5,6.0862540373602
+164,1,5,1.25321535697964
+164,2,5,2.03833270786335
+164,3,5,2.44594553570962
+164,4,5,2.70600596482496
+164,5,5,4.6855826520896
+164,6,5,5.37092292764573
+164,7,5,6.19475318486653
+164,8,5,7.22234055091044
+165,1,5,1.00664552922359
+165,2,5,1.7171680741074
+165,3,5,2.06937727670354
+165,4,5,2.97730237529092
+165,5,5,4.32308381364884
+165,6,5,5.71856864449326
+165,7,5,6.24525117654903
+165,8,5,7.2052447929068
+166,1,5,-1.4718533836255
+166,2,5,-0.588772870941195
+166,3,5,-0.0624960662781316
+166,4,5,0.491697585570406
+166,5,5,1.69920267162904
+166,6,5,3.37409071109295
+166,7,5,3.83024230236719
+166,8,5,5.00951530150485
+167,1,5,-0.607864283484989
+167,2,5,-0.0418177192546643
+167,3,5,0.698535572057312
+167,4,5,0.901324379263428
+167,5,5,2.79178684877704
+167,6,5,3.72835771969893
+167,7,5,4.54282199057793
+167,8,5,5.72948476464321
+168,1,5,0.200958886180729
+168,2,5,0.802340917781474
+168,3,5,1.27035103196685
+168,4,5,2.19711914880414
+168,5,5,3.23068618552698
+168,6,5,4.46970917688263
+168,7,5,5.42317054951253
+168,8,5,6.50804555661555
+169,1,5,0.0674731172700116
+169,2,5,0.507237516662246
+169,3,5,0.108760678450507
+169,4,5,1.2162416343858
+169,5,5,2.9517080409866
+169,6,5,3.94070623679996
+169,7,5,4.73529607890144
+169,8,5,5.64402582861047
+170,1,5,0.323097510356277
+170,2,5,0.899917182906831
+170,3,5,1.41380536838571
+170,4,5,2.36254857153997
+170,5,5,3.66957487642263
+170,6,5,4.64478093842395
+170,7,5,5.49150398374669
+170,8,5,6.30597612053104
+171,1,5,0.257146298326122
+171,2,5,0.807139154449787
+171,3,5,1.51490378109914
+171,4,5,2.44162215343575
+171,5,5,3.34879387663644
+171,6,5,4.56535689058725
+171,7,5,5.08392715840072
+171,8,5,6.64577434826485
+172,1,5,1.07442579743174
+172,2,5,1.11713008367254
+172,3,5,1.84807148756904
+172,4,5,2.22664278974788
+172,5,5,3.5125714070689
+172,6,5,4.46591009194485
+172,7,5,5.67428613083647
+172,8,5,6.5901028045963
+173,1,5,1.11353773046385
+173,2,5,1.39859094841482
+173,3,5,1.83383044657518
+173,4,5,2.50954782806781
+173,5,5,4.24807100347722
+173,6,5,4.93629127979168
+173,7,5,6.0126683755917
+173,8,5,7.12687807353198
+174,1,5,0.882227861333195
+174,2,5,1.50660300113403
+174,3,5,1.51249412161028
+174,4,5,2.53017611493252
+174,5,5,3.88136850994042
+174,6,5,5.02833646153472
+174,7,5,5.92625459663981
+174,8,5,6.95399935432011
+175,1,5,-0.0970396494888282
+175,2,5,0.226704462041368
+175,3,5,0.33832107775446
+175,4,5,1.14377884803933
+175,5,5,3.00187534622307
+175,6,5,3.51757115304584
+175,7,5,4.69335499585095
+175,8,5,6.01843425497474
+176,1,5,-0.658535943004998
+176,2,5,-0.230320331136193
+176,3,5,0.776663101309249
+176,4,5,1.06224281867045
+176,5,5,2.48730802469369
+176,6,5,3.45136727914191
+176,7,5,4.82824700388281
+176,8,5,5.29942033580146
+177,1,5,1.4255783453699
+177,2,5,2.14014946272135
+177,3,5,2.44472412661084
+177,4,5,2.83742801918473
+177,5,5,4.62576722313449
+177,6,5,5.328331234428
+177,7,5,6.56915410754912
+177,8,5,7.83258416591696
+178,1,5,-0.650135897405137
+178,2,5,-0.205276274709679
+178,3,5,0.317605110928796
+178,4,5,1.15255629084519
+178,5,5,2.24872774444275
+178,6,5,3.44192562386023
+178,7,5,4.1418005071401
+178,8,5,5.28892135599992
+179,1,5,1.05604366436219
+179,2,5,1.42116900211466
+179,3,5,1.90217779285062
+179,4,5,2.32016885606758
+179,5,5,4.02830768350421
+179,6,5,4.81009267727798
+179,7,5,5.87489091008465
+179,8,5,6.91120869560989
diff --git a/diff_diff/two_stage.py b/diff_diff/two_stage.py
index 96eb398f..5255dc9c 100644
--- a/diff_diff/two_stage.py
+++ b/diff_diff/two_stage.py
@@ -2430,8 +2430,6 @@ def _stage2_static(
         att = float(coef[0])
 
         # GMM sandwich variance
-        eps_2 = y_tilde - np.dot(X_2, coef)  # Stage 2 residuals
-
         V = self._compute_gmm_variance(
             df=df,
             unit=unit,
@@ -2443,7 +2441,6 @@ def _stage2_static(
             delta_hat=delta_hat,
             kept_cov_mask=kept_cov_mask,
             X_2=X_2,
-            eps_2=eps_2,
             cluster_ids=df[cluster_var].values,
             survey_weights=survey_weights,
             resolved_survey=resolved_survey,
@@ -2601,7 +2598,6 @@ def _stage2_event_study(
             weights=survey_weights,
             weight_type=survey_weight_type,
         )
-        eps_2 = y_tilde - np.dot(X_2, coef)
 
         # GMM variance for full coefficient vector
         V = self._compute_gmm_variance(
@@ -2615,7 +2611,6 @@ def _stage2_event_study(
             delta_hat=delta_hat,
             kept_cov_mask=kept_cov_mask,
             X_2=X_2,
-            eps_2=eps_2,
             cluster_ids=df[cluster_var].values,
             survey_weights=survey_weights,
             resolved_survey=resolved_survey,
@@ -2727,7 +2722,6 @@ def _stage2_group(
             weights=survey_weights,
             weight_type=survey_weight_type,
         )
-        eps_2 = y_tilde - np.dot(X_2, coef)
 
         # GMM variance
         V = self._compute_gmm_variance(
@@ -2741,7 +2735,6 @@ def _stage2_group(
             delta_hat=delta_hat,
             kept_cov_mask=kept_cov_mask,
             X_2=X_2,
-            eps_2=eps_2,
             cluster_ids=df[cluster_var].values,
             survey_weights=survey_weights,
             resolved_survey=resolved_survey,
@@ -2832,7 +2825,6 @@ def _compute_gmm_variance(
         delta_hat: Optional[np.ndarray],
         kept_cov_mask: Optional[np.ndarray],
         X_2: np.ndarray,
-        eps_2: np.ndarray,
         cluster_ids: np.ndarray,
         survey_weights: Optional[np.ndarray] = None,
         resolved_survey=None,
@@ -2865,9 +2857,9 @@ def _compute_gmm_variance(
         Parameters
         ----------
         X_2 : np.ndarray, shape (n, k)
-            Stage 2 design matrix (treatment indicators).
-        eps_2 : np.ndarray, shape (n,)
-            Stage 2 residuals.
+            Stage 2 design matrix (treatment indicators). The Stage-2 residual
+            ``eps_2`` is re-solved internally from the *exact* Stage-1 residuals
+            (see the exact-residual note below), so it is not a parameter.
         cluster_ids : np.ndarray, shape (n,)
             Cluster identifiers, fit-sample length. Used for the per-cluster
             stage-1 / stage-2 score aggregation (OLS path).
@@ -2923,9 +2915,17 @@ def _compute_gmm_variance(
 
         # eps_10 = Y - X_10 @ gamma_hat
         # Untreated: stage 1 residual (Y - fitted). Treated: Y (X_10 rows = 0).
-        # Reconstruct Y from y_tilde: Y = y_tilde + fitted_stage1
+        # Reconstruct Y from y_tilde: Y = y_tilde + fitted_stage1. Because
+        # y_tilde = Y - fitted_1, the iterative FE in fitted_1 cancel exactly, so
+        # y_vals == Y (independent of the iterative solver's tolerance).
         alpha_i = df[unit].map(unit_fe).values
         beta_t = df[time].map(time_fe).values
+        # Identification mask: obs whose unit AND time FE are both identified by the
+        # untreated Stage-1 fit. Rank-deficient / Proposition-5 obs (NaN FE) keep the
+        # iterative-residual behavior; only identified obs get the exact residuals.
+        identified = np.isfinite(np.asarray(alpha_i, dtype=float)) & np.isfinite(
+            np.asarray(beta_t, dtype=float)
+        )
         alpha_i = np.where(pd.isna(alpha_i), 0.0, alpha_i).astype(float)
         beta_t = np.where(pd.isna(beta_t), 0.0, beta_t).astype(float)
         fitted_1 = alpha_i + beta_t
@@ -2937,21 +2937,22 @@ def _compute_gmm_variance(
 
         y_tilde = df["_y_tilde"].values
         y_vals = y_tilde + fitted_1  # reconstruct Y
+        y_vals_clean = np.nan_to_num(y_vals, nan=0.0)
 
-        # eps_10: for untreated, stage 1 residual; for treated, Y_i (since X_10 rows = 0)
-        eps_10 = np.empty(n)
         omega_0 = omega_0_mask.values
-        eps_10[omega_0] = y_vals[omega_0] - fitted_1[omega_0]  # Stage 1 residual
-        eps_10[~omega_0] = y_vals[~omega_0]  # x_{10i} = 0, so eps_10 = Y
 
         # 1. gamma_hat = (X'_{10} W X_{10})^{-1} (X'_1 W X_2)  [p x k]
-        # With survey weights, both cross-products need W
+        # With survey weights, both cross-products need W. We reuse the SAME
+        # factorization of (X'_{10} W X_{10}) to also solve the exact Stage-1 FE
+        # coefficients theta_exact (see exact-residual note below).
         if survey_weights is not None:
             XtWX_10 = X_10_sparse.T @ X_10_sparse.multiply(survey_weights[:, None])
             Xt1_WX2 = X_1_sparse.T @ (X_2 * survey_weights[:, None])
+            rhs_fe = X_10_sparse.T @ (survey_weights * y_vals_clean)
         else:
             XtWX_10 = X_10_sparse.T @ X_10_sparse  # (p x p) sparse
             Xt1_WX2 = X_1_sparse.T @ X_2  # (p x k) dense
+            rhs_fe = X_10_sparse.T @ y_vals_clean  # (p,) X'_{10} W Y
 
         try:
             solve_XtX = sparse_factorized(XtWX_10.tocsc())
@@ -2961,6 +2962,7 @@ def _compute_gmm_variance(
                 gamma_hat = np.column_stack(
                     [solve_XtX(Xt1_WX2[:, j]) for j in range(Xt1_WX2.shape[1])]
                 )
+            theta_exact = np.asarray(solve_XtX(np.asarray(rhs_fe).ravel())).ravel()
         except RuntimeError as exc:
             # Singular matrix — fall back to dense least-squares. Silent-failure
             # audit axis C: emit a UserWarning on fallback instead of swallowing.
@@ -2973,9 +2975,32 @@ def _compute_gmm_variance(
                 UserWarning,
                 stacklevel=2,
             )
-            gamma_hat = np.linalg.lstsq(XtWX_10.toarray(), Xt1_WX2, rcond=None)[0]
+            XtWX_10_dense = XtWX_10.toarray()
+            gamma_hat = np.linalg.lstsq(XtWX_10_dense, Xt1_WX2, rcond=None)[0]
             if gamma_hat.ndim == 1:
                 gamma_hat = gamma_hat.reshape(-1, 1)
+            theta_exact = np.linalg.lstsq(XtWX_10_dense, np.asarray(rhs_fe).ravel(), rcond=None)[0]
+
+        # Exact Stage-1 / Stage-2 residuals. The point-estimate path uses the
+        # iterative alternating-projection FE solver (`_iterative_fe`), which
+        # converges only to ~1e-7 on unbalanced untreated panels; that error is
+        # negligible for the ATT but perturbs the variance by ~1% relative to the
+        # analytical GMM sandwich. The variance therefore re-solves the Stage-1 FE
+        # EXACTLY using the sparse normal equations already factorized for gamma_hat
+        # (theta_exact), matching R `did2s` to ~1e-7 and mirroring ImputationDiD's
+        # exact-sparse variance path. The shared `_exact_gmm_residuals` helper is
+        # used by BOTH this analytical path and the multiplier bootstrap
+        # (`_compute_cluster_S_scores`) so the influence function is single-sourced.
+        eps_10, eps_2 = self._exact_gmm_residuals(
+            X_1_sparse,
+            theta_exact,
+            y_vals_clean,
+            identified,
+            omega_0,
+            y_tilde,
+            X_2,
+            survey_weights,
+        )
 
         # 2. Per-cluster Stage 1 scores: c_g = sum_{i in g} w_i * x_{10i} * eps_{10i}
         # Only untreated obs have non-zero X_10 rows
@@ -3186,8 +3211,13 @@ def _build_fe_design(
         """
         Build sparse FE design matrices X_1 (all obs) and X_10 (untreated rows only).
 
-        Column layout: [unit_0, ..., unit_{U-2}, time_0, ..., time_{T-2}, cov_1, ..., cov_C]
-        (Drop first unit and first time for identification.)
+        Column layout: [intercept, unit_1, ..., unit_{U-1}, time_1, ..., time_{T-1},
+        cov_1, ..., cov_C] (drop first unit and first time for identification, with an
+        intercept). The intercept makes the column space span the constant (the grand
+        mean); the prior intercept-free layout silently omitted the grand mean from the
+        FE span, which biased the GMM-sandwich residuals when re-solved exactly. With
+        the intercept this is the standard full-rank two-way FE (matches fixest / R
+        ``did2s``).
 
         X_10 is identical to X_1 except that rows for treated observations are zeroed out.
 
@@ -3210,30 +3240,38 @@ def _build_fe_design(
         n_units = len(all_units)
         n_times = len(all_times)
         n_cov = len(covariates) if covariates else 0
-        n_fe_cols = (n_units - 1) + (n_times - 1)
+        # [intercept, unit_1..unit_{U-1}, time_1..time_{T-1}] — the intercept (col 0)
+        # makes the column space span the constant / grand mean (see docstring).
+        n_fe_cols = 1 + (n_units - 1) + (n_times - 1)
 
         def _build_rows(mask=None):
             """Build sparse matrix for given observation mask."""
-            # Unit dummies (drop first)
+            all_rows = np.arange(n)
+
+            # Intercept (col 0): 1 for every (masked) row.
+            i_rows = all_rows if mask is None else all_rows[mask]
+            i_cols = np.zeros(len(i_rows), dtype=int)
+
+            # Unit dummies (drop first) at cols 1..n_units-1
             u_indices = np.array([unit_to_idx[u] for u in unit_vals])
             u_mask = u_indices > 0
             if mask is not None:
                 u_mask = u_mask & mask
 
-            u_rows = np.arange(n)[u_mask]
-            u_cols = u_indices[u_mask] - 1
+            u_rows = all_rows[u_mask]
+            u_cols = u_indices[u_mask]  # 1..n_units-1 (intercept occupies col 0)
 
-            # Time dummies (drop first)
+            # Time dummies (drop first) at cols n_units..n_units+n_times-2
             t_indices = np.array([time_to_idx[t] for t in time_vals])
             t_mask = t_indices > 0
             if mask is not None:
                 t_mask = t_mask & mask
 
-            t_rows = np.arange(n)[t_mask]
-            t_cols = (n_units - 1) + t_indices[t_mask] - 1
+            t_rows = all_rows[t_mask]
+            t_cols = n_units + t_indices[t_mask] - 1
 
-            rows = np.concatenate([u_rows, t_rows])
-            cols = np.concatenate([u_cols, t_cols])
+            rows = np.concatenate([i_rows, u_rows, t_rows])
+            cols = np.concatenate([i_cols, u_cols, t_cols])
             data = np.ones(len(rows))
 
             A_fe = sparse.csr_matrix((data, (rows, cols)), shape=(n, n_fe_cols))
diff --git a/diff_diff/two_stage_bootstrap.py b/diff_diff/two_stage_bootstrap.py
index 2a2a693d..e21aae42 100644
--- a/diff_diff/two_stage_bootstrap.py
+++ b/diff_diff/two_stage_bootstrap.py
@@ -22,7 +22,6 @@
 from diff_diff.bootstrap_utils import (
     generate_survey_multiplier_weights_batch as _generate_survey_multiplier_weights_batch,
 )
-from diff_diff.linalg import solve_ols
 from diff_diff.two_stage_results import TwoStageBootstrapResults
 
 # Maximum number of elements before falling back to per-column sparse aggregation.
@@ -63,6 +62,66 @@ def _compute_gmm_scores(
             s2_by_cluster: np.ndarray,
         ) -> np.ndarray: ...
 
+    @staticmethod
+    def _exact_gmm_residuals(
+        X_1_sparse,
+        theta_exact: np.ndarray,
+        y_vals_clean: np.ndarray,
+        identified: np.ndarray,
+        omega_0: np.ndarray,
+        y_tilde: np.ndarray,
+        X_2: np.ndarray,
+        survey_weights: Optional[np.ndarray],
+    ) -> Tuple[np.ndarray, np.ndarray]:
+        """Exact Stage-1 / Stage-2 residuals for the GMM influence function.
+
+        Given the EXACT Stage-1 FE coefficients ``theta_exact`` (solved from the
+        same ``(X'_{10} W X_{10})`` factorization used for ``gamma_hat``), return the
+        exact Stage-1 residual ``eps_10`` (untreated rows) and the exact Stage-2
+        residual ``eps_2``. **Shared** by the analytical GMM variance
+        (``TwoStageDiD._compute_gmm_variance``) and the multiplier bootstrap
+        (``_compute_cluster_S_scores``) so both build the per-cluster influence
+        score ``S_g = gamma_hat' c_g - X'_{2g} eps_{2g}`` from the same exact
+        residuals. The iterative alternating-projection FE used for the point
+        estimate is only ~1e-7-accurate on unbalanced untreated panels, which
+        perturbs the variance ~1% relative to the analytical sandwich; obs whose FE
+        are unidentified (rank-deficient / Proposition-5) fall back to the iterative
+        residual ``y_tilde`` so those edge cases are unchanged.
+        """
+        n = X_1_sparse.shape[0]
+        fitted_exact = np.asarray(X_1_sparse @ theta_exact).ravel()
+        y_tilde_exact = y_vals_clean - fitted_exact
+        use_exact = identified & np.isfinite(y_tilde_exact)
+        y_tilde_use = np.where(use_exact, y_tilde_exact, y_tilde)
+        eps_10 = np.empty(n)
+        eps_10[omega_0] = y_tilde_use[omega_0]  # exact Stage-1 residual (untreated)
+        eps_10[~omega_0] = y_vals_clean[~omega_0]  # x_{10i} = 0, so value is inert
+        # Exact Stage-2 residual: re-solve delta on the exact residualized outcome
+        # (X_2 already has NaN-y_tilde rows zeroed by the caller, so masked obs
+        # contribute nothing to the normal equations).
+        y_tilde_s2 = np.where(np.isfinite(y_tilde_use), y_tilde_use, 0.0)
+        if survey_weights is not None:
+            XtWX2 = X_2.T @ (X_2 * survey_weights[:, None])
+            XtWy2 = X_2.T @ (survey_weights * y_tilde_s2)
+        else:
+            XtWX2 = X_2.T @ X_2
+            XtWy2 = X_2.T @ y_tilde_s2
+        try:
+            delta_2 = np.linalg.solve(XtWX2, XtWy2)
+        except np.linalg.LinAlgError:
+            # Silent-failure audit convention: warn before the dense fallback.
+            warnings.warn(
+                "TwoStageDiD GMM sandwich: Stage-2 design (X'_2 W X_2) is "
+                "singular; falling back to dense lstsq for the exact-residual "
+                "re-solve. This may indicate collinear treatment/horizon "
+                "indicators and SE estimates may be less reliable.",
+                UserWarning,
+                stacklevel=2,
+            )
+            delta_2 = np.linalg.lstsq(XtWX2, XtWy2, rcond=None)[0]
+        eps_2 = y_tilde_s2 - X_2 @ delta_2
+        return eps_10, eps_2
+
     def _compute_cluster_S_scores(
         self,
         df: pd.DataFrame,
@@ -75,7 +134,6 @@ def _compute_cluster_S_scores(
         delta_hat: Optional[np.ndarray],
         kept_cov_mask: Optional[np.ndarray],
         X_2: np.ndarray,
-        eps_2: np.ndarray,
         cluster_ids: np.ndarray,
         survey_weights: Optional[np.ndarray] = None,
     ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
@@ -91,7 +149,6 @@ def _compute_cluster_S_scores(
         unique_clusters : np.ndarray
             Unique cluster identifiers.
         """
-        n = len(df)
         k = X_2.shape[1]
 
         cov_list = covariates
@@ -103,9 +160,14 @@ def _compute_cluster_S_scores(
         )
         p = X_1_sparse.shape[1]
 
-        # Reconstruct Y and compute eps_10
+        # Reconstruct Y = y_tilde + fitted_1 (the iterative FE cancel, so y_vals == Y).
         alpha_i = df[unit].map(unit_fe).values
         beta_t = df[time].map(time_fe).values
+        # Obs whose unit AND time FE are both identified by the untreated Stage-1
+        # fit; unidentified obs fall back to the iterative residual in the helper.
+        identified = np.isfinite(np.asarray(alpha_i, dtype=float)) & np.isfinite(
+            np.asarray(beta_t, dtype=float)
+        )
         alpha_i = np.where(pd.isna(alpha_i), 0.0, alpha_i).astype(float)
         beta_t = np.where(pd.isna(beta_t), 0.0, beta_t).astype(float)
         fitted_1 = alpha_i + beta_t
@@ -116,20 +178,20 @@ def _compute_cluster_S_scores(
                 fitted_1 = fitted_1 + np.dot(df[cov_list].values, delta_hat)
 
         y_tilde = df["_y_tilde"].values
-        y_vals = y_tilde + fitted_1
-
-        eps_10 = np.empty(n)
+        y_vals_clean = np.nan_to_num(y_tilde + fitted_1, nan=0.0)
         omega_0 = omega_0_mask.values
-        eps_10[omega_0] = y_vals[omega_0] - fitted_1[omega_0]
-        eps_10[~omega_0] = y_vals[~omega_0]
 
-        # gamma_hat — with survey weights, both cross-products need W
+        # gamma_hat — with survey weights, both cross-products need W. The same
+        # (X'_{10} W X_{10}) factorization also yields the EXACT Stage-1 FE
+        # coefficients theta_exact for the exact-residual helper below.
         if survey_weights is not None:
             XtX_10 = X_10_sparse.T @ X_10_sparse.multiply(survey_weights[:, None])
             Xt1_X2 = X_1_sparse.T @ (X_2 * survey_weights[:, None])
+            rhs_fe = X_10_sparse.T @ (survey_weights * y_vals_clean)
         else:
             XtX_10 = X_10_sparse.T @ X_10_sparse
             Xt1_X2 = X_1_sparse.T @ X_2
+            rhs_fe = X_10_sparse.T @ y_vals_clean
 
         try:
             solve_XtX = sparse_factorized(XtX_10.tocsc())
@@ -139,6 +201,7 @@ def _compute_cluster_S_scores(
                 gamma_hat = np.column_stack(
                     [solve_XtX(Xt1_X2[:, j]) for j in range(Xt1_X2.shape[1])]
                 )
+            theta_exact = np.asarray(solve_XtX(np.asarray(rhs_fe).ravel())).ravel()
         except RuntimeError as exc:
             # Silent-failure audit axis C: emit a UserWarning on fallback instead
             # of swallowing the error.
@@ -150,9 +213,25 @@ def _compute_cluster_S_scores(
                 UserWarning,
                 stacklevel=2,
             )
-            gamma_hat = np.linalg.lstsq(XtX_10.toarray(), Xt1_X2, rcond=None)[0]
+            XtX_10_dense = XtX_10.toarray()
+            gamma_hat = np.linalg.lstsq(XtX_10_dense, Xt1_X2, rcond=None)[0]
             if gamma_hat.ndim == 1:
                 gamma_hat = gamma_hat.reshape(-1, 1)
+            theta_exact = np.linalg.lstsq(XtX_10_dense, np.asarray(rhs_fe).ravel(), rcond=None)[0]
+
+        # Exact Stage-1 / Stage-2 residuals (shared with the analytical variance) so
+        # the bootstrap influence function uses the same exact residuals as
+        # _compute_gmm_variance (not the ~1e-7 iterative residualized outcome).
+        eps_10, eps_2 = self._exact_gmm_residuals(
+            X_1_sparse,
+            theta_exact,
+            y_vals_clean,
+            identified,
+            omega_0,
+            y_tilde,
+            X_2,
+            survey_weights,
+        )
 
         # Per-cluster aggregation — survey weights multiply eps_10 before sparse multiply
         if survey_weights is not None:
@@ -307,10 +386,8 @@ def _run_bootstrap(
 
         # Extract survey weights for S-score computation and Stage-2 WLS
         survey_weights: Optional[np.ndarray] = None
-        survey_weight_type: str = "pweight"
         if resolved_survey is not None:
             survey_weights = resolved_survey.weights
-            survey_weight_type = resolved_survey.weight_type
 
         # Handle NaN y_tilde (from unidentified FEs) — matches _stage2_static logic
         nan_mask = ~np.isfinite(y_tilde)
@@ -326,14 +403,6 @@ def _run_bootstrap(
             return None
 
         X_2_static = D.reshape(-1, 1)
-        coef_static = solve_ols(
-            X_2_static,
-            y_tilde,
-            return_vcov=False,
-            weights=survey_weights,
-            weight_type=survey_weight_type,
-        )[0]
-        eps_2_static = y_tilde - np.dot(X_2_static, coef_static)
 
         S_static, bread_static, unique_clusters = self._compute_cluster_S_scores(
             df=df,
@@ -346,7 +415,6 @@ def _run_bootstrap(
             delta_hat=delta_hat,
             kept_cov_mask=kept_cov_mask,
             X_2=X_2_static,
-            eps_2=eps_2_static,
             cluster_ids=cluster_ids,
             survey_weights=survey_weights,
         )
@@ -485,15 +553,6 @@ def _run_bootstrap(
                         if h_int in horizon_to_col:
                             X_2_es[i, horizon_to_col[h_int]] = 1.0
 
-                coef_es = solve_ols(
-                    X_2_es,
-                    y_tilde,
-                    return_vcov=False,
-                    weights=survey_weights,
-                    weight_type=survey_weight_type,
-                )[0]
-                eps_2_es = y_tilde - np.dot(X_2_es, coef_es)
-
                 S_es, bread_es, _ = self._compute_cluster_S_scores(
                     df=df,
                     unit=unit,
@@ -505,7 +564,6 @@ def _run_bootstrap(
                     delta_hat=delta_hat,
                     kept_cov_mask=kept_cov_mask,
                     X_2=X_2_es,
-                    eps_2=eps_2_es,
                     cluster_ids=cluster_ids,
                     survey_weights=survey_weights,
                 )
@@ -556,15 +614,6 @@ def _run_bootstrap(
                     if g in group_to_col:
                         X_2_grp[i, group_to_col[g]] = 1.0
 
-            coef_grp = solve_ols(
-                X_2_grp,
-                y_tilde,
-                return_vcov=False,
-                weights=survey_weights,
-                weight_type=survey_weight_type,
-            )[0]
-            eps_2_grp = y_tilde - np.dot(X_2_grp, coef_grp)
-
             S_grp, bread_grp, _ = self._compute_cluster_S_scores(
                 df=df,
                 unit=unit,
@@ -576,7 +625,6 @@ def _run_bootstrap(
                 delta_hat=delta_hat,
                 kept_cov_mask=kept_cov_mask,
                 X_2=X_2_grp,
-                eps_2=eps_2_grp,
                 cluster_ids=cluster_ids,
                 survey_weights=survey_weights,
             )
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
index ce725563..82cd60dd 100644
--- a/docs/methodology/REGISTRY.md
+++ b/docs/methodology/REGISTRY.md
@@ -1426,6 +1426,7 @@ Our implementation uses multiplier bootstrap on the GMM influence function: clus
 - **Note:** Both the iterative FE solver (`_iterative_fe`, Stage 1) and the iterative alternating-projection demeaning helper (`_iterative_demean`, used in covariate residualization) emit `UserWarning` when `max_iter` exhausts without reaching `tol`, via `diff_diff.utils.warn_if_not_converged`. Silent return of the current iterate was classified as a silent failure under the Phase 2 audit and replaced with an explicit signal to match the logistic/Poisson IRLS pattern in `linalg.py`.
 - **Note:** When the Stage-2 bread `X'_2 W X_2` is singular, both the analytical TSL variance (`two_stage.py`) and the multiplier-bootstrap bread (`two_stage_bootstrap.py`) now emit a `UserWarning` before falling back to `np.linalg.lstsq`. Previously this fallback was silent. Sibling of axis-A finding #17 in the Phase 2 silent-failures audit; surfaced by the repo-wide lstsq-fallback pattern grep that accompanied the StaggeredTripleDifference fix.
 - **Note:** The GMM sandwich and bootstrap paths both use `scipy.sparse.linalg.factorized` for the Stage 1 normal-equations solve `(X'_{10} W X_{10}) gamma = X'_1 W X_2` and fall back to dense `lstsq` when the sparse factorization raises `RuntimeError` on a near-singular matrix. Both fallback sites emit a `UserWarning` (silent-failure audit axis C) so callers know SE estimates came from the degraded path rather than the fast sparse path.
+- **Note:** The GMM sandwich re-solves the Stage-1 unit+time fixed effects **exactly** (sparse OLS reusing the `scipy.sparse.linalg.factorized` factorization of `(X'_{10} W X_{10})` already computed for `gamma_hat`), rather than reusing the iterative alternating-projection FE (`_iterative_fe`) that produces the point estimate. The iterative solver converges only to ~1e-7 on unbalanced *untreated* panels — negligible for the ATT, but enough to perturb the variance by ~1% relative to the analytical GMM sandwich. The exact re-solve makes the analytical GMM SE match R `did2s` to ~1e-7 (`tests/test_methodology_two_stage.py::TestTwoStageDiDParityR`), mirroring ImputationDiD's exact-sparse variance path (`imputation.py` `_build_A_sparse`). It also required adding an **intercept column** to `_build_fe_design` so the first-stage column space spans the constant (the grand mean): the prior intercept-free `[unit_1.., time_1..]` layout (drop first unit + first time, no intercept) silently omitted the grand mean, which the exact residual is first-order sensitive to (the iterative point-estimate solver absorbs the grand mean into its mean-based FE, so the point estimate was unaffected). Obs whose unit or time FE are unidentified (NaN; rank-deficient / Proposition-5) fall back to the iterative residual, so those edge cases are unchanged. The reported `overall_att` still uses the iterative FE (preserving point-estimate equivalence with ImputationDiD at 1e-10); only the variance uses the exact residuals.
 - **Note:** `vcov_type` is permanently narrow to `{"hc1"}` (Phase 1b threading). TwoStageDiD's variance is the Gardner (2022) two-stage GMM cluster-sandwich `V = (X'_2 W X_2)^{-1} (S' S) (X'_2 W X_2)^{-1}` with the per-cluster GMM-corrected score `S_g = gamma_hat' c_g - X'_{2g} eps_{2g}`. Analytical-sandwich families `{classical, hc2, hc2_bm}` are rejected at `__init__`/`fit()`: the GMM-corrected meat folds first-stage FE estimation uncertainty into the score via the `gamma_hat' c_g` term, so there is no single hat matrix spanning both stages on which HC2 leverage or Bell-McCaffrey Satterthwaite DOF can be defined, and the Gardner first-stage correction has not been derived for the leverage-corrected or homoskedastic meat structures (no reference implementation — `clubSandwich` covers single-equation WLS/OLS CR2, not two-stage GMM; mirrors the SpilloverDiD `vcov_type="classical"` rejection). `cluster=<col>` selects the cluster level; `cluster=None` (the default) still clusters at the `unit` column (`cluster_var = unit`), so the summary renders `"CR1 cluster-robust at <unit>, G=<n_units>"` rather than the generic `"HC1"` label. **Note (deviation from R):** the did2s GMM sandwich uses NO finite-sample multiplier (meat `= S' S`), so the rendered `CR1` family label carries no Stata-style `(n-1)/(n-p)` or `G/(G-1)` factor (matches R `did2s`; same FSA-free convention as ImputationDiD's Theorem 3 variance). Under bootstrap (`n_bootstrap > 0`) the analytical variance-family label is suppressed in `summary()` because `fit()` overwrites the reported SE/CI/p-value with `bootstrap_results` (mirrors `DiDResults` at `results.py:213-226`). `cluster=<col>` combined with a replicate-weight survey design raises `NotImplementedError` (replicate-refit variance ignores `cluster=`). `vcov_type='conley'` is deferred to the TwoStageDiD Conley follow-up row in TODO.md.
 
 **Reference implementation(s):**
@@ -1438,6 +1439,7 @@ Our implementation uses multiplier bootstrap on the GMM influence function: clus
 - [x] GMM sandwich variance (Newey & McFadden 1994 Theorem 6.1)
 - [x] Global `(D'D)^{-1}` in variance (faithful to Gardner §3.3 / Newey-McFadden GMM sandwich; matches R `did2s`)
 - [x] No finite-sample adjustment (raw asymptotic sandwich)
+- [x] Analytical GMM SE matches R `did2s` to ~1e-7 (exact Stage-1 re-solve; `TestTwoStageDiDParityR`)
 - [x] Always-treated units excluded with warning
 - [x] Multiplier bootstrap on GMM influence function
 - [x] Event study and overall ATT aggregation
diff --git a/docs/methodology/papers/gardner-2022-review.md b/docs/methodology/papers/gardner-2022-review.md
index 1a626147..322ea07b 100644
--- a/docs/methodology/papers/gardner-2022-review.md
+++ b/docs/methodology/papers/gardner-2022-review.md
@@ -102,9 +102,9 @@ where `W_gpit = [Y_gpit, (1g)_gpit, (1p)_gpit, D_gp]`. The first moment block is
 - [x] **Variance = joint-GMM Newey-McFadden Thm 6.1 sandwich with the GLOBAL Jacobian inverse, meat clustered at the unit** (Appendix B `vce(cluster id)`); **no finite-sample multiplier**. Library matches this (the global-bread sandwich `(X'₂WX₂)^{-1}(S'S)(X'₂WX₂)^{-1}` with unit-clustered GMM-corrected score `S_g = γ̂'c_g − X'_{2g}ε_{2g}`).
 - [x] Always-treated exclusion (fn 19); covariates in both stages (fn 9, `covariates=` — the in-both-regressions approach); `anticipation` handling.
 - ⚠️ **Paper-permitted but NOT exposed by `TwoStageDiD`** (recorded as scope, not a deliverable): the **`P̄`-average estimand eq. (5)** (duration-restricted Stage-2 sample) and the **full-sample first-stage variant (fn 8)** — both described in the paper as valid modifications, but the library has no public parameter for either (`get_params()` = `anticipation, alpha, cluster, n_bootstrap, bootstrap_weights, seed, rank_deficient_action, horizon_max, pretrends, vcov_type`; Stage 1 is always untreated-only). The fn-9 last-pre-treatment / doubly-robust covariate variants are likewise not exposed (only the in-both-stages approach is).
-- [ ] **Dedicated `tests/test_methodology_two_stage.py`** with eq./section-numbered Verified Components (Stage-1 FE recovery; Stage-2 overall ATT eq. 4 + event-study eq. 6; GMM first-stage-correction behavior; always-treated drop). — *PR-B.*
-- [ ] **R parity fixture vs `did2s`** (`benchmarks/R/generate_did2s_golden.R` + `benchmarks/data/did2s_golden.json` + `did2s_test_panel.csv`); overall ATT + event-study ATT/SE. — *PR-B.*
-- [x] **`two_stage.py` docstring checked — already accurate.** `_compute_gmm_variance`'s docstring (`two_stage.py:2842-2857`) states the code uses the GLOBAL Hessian inverse matching `did2s` and gives the correct sandwich `V = (X'₂X₂)^{-1}(Σ_g S_g S_g')(X'₂X₂)^{-1}`; it does **not** fabricate a paper-Eq.6 deviation, so **no source correction is needed**. The misattribution was confined to the REGISTRY note (fixed in this PR).
+- [x] **Dedicated `tests/test_methodology_two_stage.py`** with eq./section-numbered Verified Components (Stage-1 FE recovery; Stage-2 overall ATT eq. 4 + event-study eq. 6; GMM first-stage-correction behavior; always-treated drop). — *PR-B.*
+- [x] **R parity fixture vs `did2s`** (`benchmarks/R/generate_did2s_golden.R` + `benchmarks/data/did2s_golden.json` + `did2s_test_panel.csv`); overall ATT (abs 1e-6) + event-study ATT (1e-6) / SE (1e-7). — *PR-B.*
+- [x] **`two_stage.py` docstring/formula faithful; variance made exact in PR-B.** `_compute_gmm_variance`'s docstring (`two_stage.py:2842-2857`) states the code uses the GLOBAL Hessian inverse matching `did2s` and gives the correct sandwich `V = (X'₂X₂)^{-1}(Σ_g S_g S_g')(X'₂X₂)^{-1}` — the *formula* needed no correction (the Eq.6 misattribution was confined to the REGISTRY note, fixed in PR-A). However, the PR-B `did2s` SE parity surfaced a ~1% **numerical** gap: the variance derived its residuals from the *iterative* alternating-projection first-stage FE (`_iterative_fe`, converged to ~1e-7 on unbalanced untreated panels) while computing `gamma_hat` exactly. PR-B corrects this by re-solving the Stage-1 FE **exactly** inside the variance (sparse OLS, reusing the `gamma_hat` factorization) and adding an intercept to `_build_fe_design` so its column space spans the grand mean — yielding ~1e-7 `did2s` SE parity, mirroring ImputationDiD's exact-sparse variance. The point estimate stays on the iterative FE (twin-equivalence preserved). See REGISTRY `## TwoStageDiD` Notes.
 
 ---
 
@@ -152,4 +152,4 @@ No regularization, bandwidth, factor-count, or cross-validation tuning (the esti
 | **"No finite-sample multiplier" deviation note** (REGISTRY `## TwoStageDiD`). | **Accurate and correctly labeled.** Appendix B's `onestep`/`vce(cluster id)` GMM carries no `(n-1)/(n-p)` or `G/(G-1)` factor; the library matches (meat `= S'S`). | **No change** — verified faithful. |
 | **Version.** Reviewed arXiv:2207.05943**v1** (title page "April 2021"; arXiv-stamped 13 Jul 2022). | Equation numbers are this version's. The misattributions above are **independent of versioning**: even if a later version renumbered the variance to "(6)", the per-cluster-inverse claim is false regardless (the variance uses a global inverse). | Citation pinned to **Gardner (2022)** per `docs/references.rst`. |
 
-**PR-B deliverables** (tracked in `TODO.md` Testing/Docs): `tests/test_methodology_two_stage.py` (Stage-1/Stage-2/GMM-correction/always-treated Verified Components); `did2s` R parity fixture; then flip the `METHODOLOGY_REVIEW.md` `TwoStageDiD` row **In Progress → Complete**. (No `two_stage.py` source change is required — the docstring is already accurate; the only documentation defect, the REGISTRY misattributions, is fixed in this PR.)
+**PR-B deliverables** (tracked in `TODO.md` Testing/Docs): `tests/test_methodology_two_stage.py` (Stage-1/Stage-2/GMM-correction/always-treated Verified Components); `did2s` R parity fixture; then flip the `METHODOLOGY_REVIEW.md` `TwoStageDiD` row **In Progress → Complete**. (The variance *formula* and docstring needed no correction; the only documentation defect, the REGISTRY misattributions, was fixed in PR-A. PR-B did make one **numerical-accuracy** source change — re-solving the Stage-1 FE exactly inside `_compute_gmm_variance` + an intercept in `_build_fe_design` — so the analytical GMM SE matches R `did2s` to ~1e-7; see the checklist item above and REGISTRY `## TwoStageDiD` Notes.)
diff --git a/tests/test_methodology_two_stage.py b/tests/test_methodology_two_stage.py
new file mode 100644
index 00000000..5adc6d80
--- /dev/null
+++ b/tests/test_methodology_two_stage.py
@@ -0,0 +1,820 @@
+"""Methodology verification tests for TwoStageDiD.
+
+Targets Gardner, J. (2022), *Two-stage differences in differences*,
+arXiv:2207.05943 [econ.EM]. Each Verified Component class maps to a numbered
+section / equation of the paper, verified against the source PDF in
+``docs/methodology/papers/gardner-2022-review.md``:
+
+- **§3, eqs. (4) / (6)** — the two-stage procedure (Stage 1 fits unit+time FE on
+  the untreated set Omega_0 only; Stage 2 regresses the residualized outcome on
+  treatment indicators) recovers the overall ATT ``E(beta_gp | D_gp = 1)`` (eq. 4)
+  and the event-study horizons (Step 2' / eq. 6) under arbitrary heterogeneity
+  (``TestGardner2022Section3TwoStageProcedure``).
+- **§3.3 + Appendix B** — the joint-GMM Newey-McFadden (1994) Theorem 6.1
+  sandwich with the GLOBAL Jacobian inverse, meat clustered at the unit, NO
+  finite-sample multiplier; the variance folds in Stage-1 estimation error via
+  the ``gamma_hat' c_g`` correction (``TestGardner2022Section33GMMVariance``).
+- **fn. 19 + Proposition 5 of Borusyak et al. (2024)** — always-treated units are
+  excluded (no untreated obs for Stage 1); without never-treated units horizons
+  ``h >= h_bar = max(groups) - min(groups)`` are not identified -> NaN + warning;
+  REGISTRY edge cases ``balance_e`` and zero-obs cohorts
+  (``TestGardner2022Identification``).
+- Library extensions / deviations (multiplier bootstrap on the GMM influence
+  function; ``vcov_type`` narrowed to the GMM sandwich) -- Gardner prescribes
+  analytical GMM SEs only (``TestGardner2022LibraryDeviations``).
+
+R-parity (bottom of file, NOT a methodology walk-through): ``TestTwoStageDiDParityR``
+pins Python output against R ``did2s::did2s()`` on a fixed-seed golden. R ``did2s``
+defaults to analytical corrected clustered SEs (``bootstrap = FALSE``), the same
+GMM sandwich the library computes; see ``docs/methodology/REGISTRY.md``
+``## TwoStageDiD``.
+
+Point estimates coincide with ImputationDiD (Borusyak, Jaravel & Spiess 2024) by
+construction -- *functional* equivalence is covered by
+``tests/test_two_stage.py::TestTwoStageDiDEquivalence``; this file asserts the
+paper-grounded contract and the did2s cross-language parity.
+
+See also:
+
+- ``docs/methodology/papers/gardner-2022-review.md`` (primary-source review)
+- ``docs/methodology/REGISTRY.md`` ``## TwoStageDiD`` block
+- ``METHODOLOGY_REVIEW.md`` ``TwoStageDiD`` section
+- ``tests/test_two_stage.py`` (implementation-detail unit tests)
+- ``benchmarks/R/generate_did2s_golden.R`` (R golden generator)
+"""
+
+from __future__ import annotations
+
+import json
+import warnings
+from pathlib import Path
+from typing import Any, Dict, List, Optional
+
+import numpy as np
+import pandas as pd
+import pytest
+
+from diff_diff import ImputationDiD, TwoStageDiD
+
+# =============================================================================
+# Module-level R-fixture availability + per-class seed decorrelation
+# =============================================================================
+
+GOLDEN_PATH = Path(__file__).parent.parent / "benchmarks" / "data" / "did2s_golden.json"
+PANEL_PATH = Path(__file__).parent.parent / "benchmarks" / "data" / "did2s_test_panel.csv"
+_R_FIXTURE_AVAILABLE = GOLDEN_PATH.is_file() and PANEL_PATH.is_file()
+
+_BASE_SEED_PROC = 7101
+_BASE_SEED_VAR = 7202
+_BASE_SEED_IDENT = 7303
+_BASE_SEED_DEV = 7404
+
+
+# =============================================================================
+# Helpers
+# =============================================================================
+
+
+def _make_staggered_panel(
+    rng: np.random.Generator,
+    *,
+    cohorts: List[int],
+    n_per_cohort: int = 100,
+    n_periods: int = 6,
+    tau_constant: Optional[float] = None,
+    tau_by_horizon: Optional[Dict[int, float]] = None,
+    sigma: float = 0.1,
+    include_never_treated: bool = True,
+    pretrend_slope: float = 0.0,
+) -> pd.DataFrame:
+    """Balanced staggered-adoption panel satisfying parallel trends.
+
+    DGP (Gardner eq. 1): ``y_it = c_i + beta_t + w_it * tau_{K_it} + u_it``,
+    with ``c_i ~ N(0,1)``, common time trend ``beta_t = 0.5 t`` (parallel
+    trends hold -- no cohort-specific trends unless ``pretrend_slope != 0``),
+    ``u_it ~ N(0, sigma^2)``. Treatment is absorbing from the cohort's event
+    date. ``first_treat = 0`` denotes never-treated.
+    """
+    if tau_constant is None and tau_by_horizon is None:
+        tau_constant = 1.0
+    rows: List[Dict[str, Any]] = []
+    unit_id = 0
+    all_cohorts = ([0] + list(cohorts)) if include_never_treated else list(cohorts)
+    cohort_rank = {g: r for r, g in enumerate(sorted(cohorts))}
+    for g in all_cohorts:
+        for _ in range(n_per_cohort):
+            c_i = rng.standard_normal()
+            for t in range(1, n_periods + 1):
+                beta_t = 0.5 * t
+                u = sigma * rng.standard_normal()
+                treated = g > 0 and t >= g
+                if treated:
+                    k = t - g
+                    if tau_by_horizon is not None:
+                        tau = tau_by_horizon.get(k, 0.0)
+                    else:
+                        tau = tau_constant if tau_constant is not None else 0.0
+                else:
+                    tau = 0.0
+                trend = pretrend_slope * cohort_rank.get(g, 0) * t if g > 0 else 0.0
+                y = c_i + beta_t + trend + (tau if treated else 0.0) + u
+                rows.append(
+                    {
+                        "unit": unit_id,
+                        "time": t,
+                        "first_treat": g,
+                        "outcome": y,
+                    }
+                )
+            unit_id += 1
+    return pd.DataFrame(rows)
+
+
+# =============================================================================
+# Section 3, eqs. (4) / (6) — the two-stage procedure
+# =============================================================================
+
+
+class TestGardner2022Section3TwoStageProcedure:
+    """§3 (eqs. 4/6): Stage 1 fits unit+time FE on the untreated subsample
+    (Omega_0), Stage 2 regresses the residualized outcome on treatment status;
+    this recovers the overall ATT (eq. 4) and event-study horizons (eq. 6)."""
+
+    def test_recovers_overall_att_eq4(self) -> None:
+        """Under a constant effect tau=2.0, the overall ATT E(beta|D=1) (eq. 4)
+        is recovered. DGP: 2 cohorts + never-treated, N=300, sigma=0.1."""
+        rng = np.random.default_rng(_BASE_SEED_PROC + 1)
+        panel = _make_staggered_panel(
+            rng, cohorts=[3, 4], n_per_cohort=100, tau_constant=2.0, sigma=0.1
+        )
+        res = TwoStageDiD().fit(
+            panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        assert abs(res.overall_att - 2.0) < 0.05
+
+    def test_recovers_heterogeneous_event_study_eq6(self) -> None:
+        """Horizon-specific effects tau_K = 1 + 0.5*K are recovered per horizon
+        via the Step 2' event-study spec (eq. 6)."""
+        rng = np.random.default_rng(_BASE_SEED_PROC + 2)
+        tau_by_h = {0: 1.0, 1: 1.5, 2: 2.0, 3: 2.5}
+        panel = _make_staggered_panel(
+            rng, cohorts=[2, 3], n_per_cohort=120, tau_by_horizon=tau_by_h, sigma=0.1
+        )
+        res = TwoStageDiD().fit(
+            panel,
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            aggregate="event_study",
+        )
+        assert res.event_study_effects is not None
+        for h, expected in tau_by_h.items():
+            assert h in res.event_study_effects, f"missing horizon {h}"
+            got = res.event_study_effects[h]["effect"]
+            assert abs(got - expected) < 0.06, f"h={h}: {got:.4f} vs {expected}"
+
+    def test_stage1_uses_untreated_only(self) -> None:
+        """Perturbing a single treated outcome by delta shifts the overall ATT by
+        exactly delta/N_treated -- proving treated observations never feed back
+        into the Stage-1 counterfactual model (FE fit on Omega_0 only)."""
+        rng = np.random.default_rng(_BASE_SEED_PROC + 3)
+        panel = _make_staggered_panel(
+            rng, cohorts=[3, 4], n_per_cohort=60, tau_constant=1.0, sigma=0.1
+        )
+        base = TwoStageDiD().fit(
+            panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        n_treated = int(
+            ((panel["first_treat"] > 0) & (panel["time"] >= panel["first_treat"])).sum()
+        )
+
+        perturbed = panel.copy()
+        treated_idx = perturbed.index[
+            (perturbed["first_treat"] > 0) & (perturbed["time"] >= perturbed["first_treat"])
+        ][0]
+        delta = 100.0
+        perturbed.loc[treated_idx, "outcome"] += delta
+        pert = TwoStageDiD().fit(
+            perturbed, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        # Only the perturbed obs's own residual changes (Stage-2 weight 1/N_treated).
+        assert abs((pert.overall_att - base.overall_att) - delta / n_treated) < 1e-6
+
+    def test_coincides_with_imputation_estimand(self) -> None:
+        """Gardner's two-stage estimator coincides in point estimates with the
+        imputation estimator (BJS 2024 p. 3258 / paper review "Relation"). The
+        overall ATT matches ImputationDiD to machine precision (they differ only
+        in the variance estimator)."""
+        rng = np.random.default_rng(_BASE_SEED_PROC + 4)
+        panel = _make_staggered_panel(
+            rng, cohorts=[2, 4], n_per_cohort=70, n_periods=6, tau_constant=1.5, sigma=0.2
+        )
+        ts = TwoStageDiD().fit(
+            panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        imp = ImputationDiD().fit(
+            panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        assert ts.overall_att == pytest.approx(imp.overall_att, abs=1e-10)
+
+    def test_covariates_residualized_in_both_stages(self) -> None:
+        """The live ``covariates=`` path (fn. 9 in-both-stages): a time-varying
+        covariate correlated with treatment status would bias the ATT if ignored,
+        but residualizing it in both stages (Stage-1 delta_hat fit on Omega_0,
+        subtracted from all obs) recovers the true ATT.
+
+        DGP: y = c_i + beta_t + theta*x + tau*D + u, with x shifted by +s on
+        treated-post obs (so x is correlated with D). Omitting x leaves theta*x in
+        y_tilde -> loads onto D -> ATT biased by ~theta*s; including x removes it.
+        """
+        rng = np.random.default_rng(_BASE_SEED_PROC + 5)
+        theta, tau, shift = 2.0, 1.0, 1.0
+        rows: List[Dict[str, Any]] = []
+        uid = 0
+        for g in (0, 3, 4):
+            for _ in range(80):
+                c_i = rng.standard_normal()
+                for t in range(1, 7):
+                    treated = g > 0 and t >= g
+                    x = rng.standard_normal() + (shift if treated else 0.0)
+                    y = (
+                        c_i
+                        + 0.5 * t
+                        + theta * x
+                        + (tau if treated else 0.0)
+                        + 0.1 * rng.standard_normal()
+                    )
+                    rows.append({"unit": uid, "time": t, "first_treat": g, "x": x, "outcome": y})
+                uid += 1
+        panel = pd.DataFrame(rows)
+
+        att_no_cov = (
+            TwoStageDiD()
+            .fit(panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat")
+            .overall_att
+        )
+        att_cov = (
+            TwoStageDiD()
+            .fit(
+                panel,
+                outcome="outcome",
+                unit="unit",
+                time="time",
+                first_treat="first_treat",
+                covariates=["x"],
+            )
+            .overall_att
+        )
+        # Without adjustment, the confounded covariate biases the ATT well away
+        # from tau; with adjustment, the true ATT is recovered.
+        assert abs(att_no_cov - tau) > 0.5, f"expected bias, got {att_no_cov:.4f}"
+        assert abs(att_cov - tau) < 0.05, f"covariate path failed: {att_cov:.4f}"
+
+
+# =============================================================================
+# Section 3.3 + Appendix B — joint-GMM Newey-McFadden Theorem 6.1 variance
+# =============================================================================
+
+
+class TestGardner2022Section33GMMVariance:
+    """§3.3 + App. B: the GMM sandwich with the GLOBAL Jacobian inverse, meat
+    clustered at the unit, NO finite-sample multiplier; the variance folds in
+    Stage-1 estimation error via the gamma_hat' c_g correction."""
+
+    def test_overall_se_finite_and_positive(self) -> None:
+        rng = np.random.default_rng(_BASE_SEED_VAR + 1)
+        panel = _make_staggered_panel(
+            rng, cohorts=[3, 4], n_per_cohort=80, tau_constant=1.0, sigma=0.2
+        )
+        res = TwoStageDiD().fit(
+            panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        assert np.isfinite(res.overall_se) and res.overall_se > 0
+
+    def test_event_study_ses_finite(self) -> None:
+        rng = np.random.default_rng(_BASE_SEED_VAR + 2)
+        panel = _make_staggered_panel(
+            rng, cohorts=[2, 3], n_per_cohort=80, tau_constant=1.0, sigma=0.2
+        )
+        res = TwoStageDiD().fit(
+            panel,
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            aggregate="event_study",
+        )
+        assert res.event_study_effects is not None
+        for h, eff in res.event_study_effects.items():
+            # Skip the normalized reference period (effect=se=0 by construction).
+            if h >= 0 and np.isfinite(eff["effect"]):
+                assert np.isfinite(eff["se"]) and eff["se"] > 0, f"h={h}"
+
+    def test_first_stage_correction_is_nontrivial(self) -> None:
+        """The GMM sandwich folds Stage-1 FE estimation uncertainty into the score
+        via the gamma_hat' c_g correction. On a design with limited untreated
+        support per FE (so the first-stage uncertainty is non-negligible), the
+        GMM SE is materially LARGER than a naive benchmark that treats the
+        residualized outcome y_tilde as raw data and ignores first-stage
+        estimation -- the iid SE of its mean, ``std(y_tilde) / sqrt(N_treated)``.
+
+        Asserts a directional/magnitude gap (not bare inequality) so a
+        near-homoskedastic coincidence can't pass it. The overall ATT is the mean
+        of ``tau_hat`` (= y_tilde for treated obs) over treated observations, so
+        the naive floor is exactly the no-first-stage SE of that mean.
+        """
+        rng = np.random.default_rng(_BASE_SEED_VAR + 3)
+        # Few periods + modest units -> each time/unit FE rests on little
+        # untreated support, so the first-stage correction visibly inflates the SE.
+        panel = _make_staggered_panel(
+            rng, cohorts=[2, 3], n_per_cohort=40, n_periods=4, tau_constant=1.0, sigma=0.5
+        )
+        res = TwoStageDiD().fit(
+            panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        gmm_se = res.overall_se
+
+        # Naive benchmark: y_tilde for the treated obs is exposed as `tau_hat`;
+        # the overall ATT is its (unweighted) mean. The SE of that mean IGNORING
+        # the generated-regressand (first-stage) correction is std/sqrt(N_treated).
+        tau_hat = res.treatment_effects["tau_hat"].values
+        tau_hat = tau_hat[np.isfinite(tau_hat)]
+        naive_floor = float(np.std(tau_hat, ddof=1)) / np.sqrt(len(tau_hat))
+
+        assert np.isfinite(gmm_se) and gmm_se > 0
+        assert gmm_se > naive_floor * 1.05, (
+            f"GMM SE {gmm_se:.5f} should exceed the no-first-stage floor "
+            f"{naive_floor:.5f} by a clear margin (the gamma_hat' c_g correction)"
+        )
+
+    def test_no_fsa_global_inverse_se_pin(self) -> None:
+        """Regression pin of the overall SE on a fixed-seed panel. Locks the
+        global-inverse + no-finite-sample-multiplier GMM sandwich convention; a
+        revert to a per-cluster inverse or an FSA multiplier would move this
+        number. Deterministic given the seed (the SE computation has no
+        randomness)."""
+        rng = np.random.default_rng(_BASE_SEED_VAR + 7)
+        panel = _make_staggered_panel(
+            rng, cohorts=[2, 4], n_per_cohort=50, n_periods=6, tau_constant=1.0, sigma=0.3
+        )
+        res = TwoStageDiD().fit(
+            panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        assert np.isfinite(res.overall_se) and res.overall_se > 0
+        assert res.overall_se == pytest.approx(_SE_PIN, abs=1e-8)
+
+    def test_pretrends_event_study_ses_finite(self) -> None:
+        """With pretrends=True the Stage-2 design adds pre-period leads; their GMM
+        SEs are finite and the overall ATT is unchanged (REGISTRY edge case)."""
+        rng = np.random.default_rng(_BASE_SEED_VAR + 4)
+        panel = _make_staggered_panel(
+            rng, cohorts=[3, 4], n_per_cohort=80, n_periods=6, tau_constant=1.0, sigma=0.2
+        )
+        base = TwoStageDiD().fit(
+            panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        res = TwoStageDiD(pretrends=True).fit(
+            panel,
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            aggregate="event_study",
+        )
+        assert res.event_study_effects is not None
+        # At least one pre-period lead is present and has a finite SE.
+        leads = [h for h in res.event_study_effects if h < 0]
+        assert leads, "pretrends=True should expose pre-period leads"
+        for h in leads:
+            eff = res.event_study_effects[h]
+            if np.isfinite(eff["effect"]):
+                assert np.isfinite(eff["se"]), f"lead h={h} SE non-finite"
+        # Overall ATT is unchanged by requesting pre-period leads.
+        assert res.overall_att == pytest.approx(base.overall_att, abs=1e-8)
+
+    def test_unit_clustered_default(self) -> None:
+        """cluster=None clusters the meat at the unit level (Appendix B
+        vce(cluster id)); n_clusters equals the number of units."""
+        rng = np.random.default_rng(_BASE_SEED_VAR + 5)
+        panel = _make_staggered_panel(
+            rng, cohorts=[3, 4], n_per_cohort=50, tau_constant=1.0, sigma=0.2
+        )
+        res = TwoStageDiD().fit(
+            panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        assert res.n_clusters == panel["unit"].nunique()
+
+    def test_singular_omega0_warns_and_falls_back(self) -> None:
+        """A rank-deficient Stage-1 design (a period observed only among treated
+        obs -> its time FE is unidentified in Omega_0 -> X_10'X_10 singular) makes
+        the sparse factorization in the GMM variance fail; the code must emit a
+        UserWarning and route to the dense lstsq fallback, still returning a
+        finite SE."""
+        rng = np.random.default_rng(_BASE_SEED_VAR + 9)
+        rows: List[Dict[str, Any]] = []
+        uid = 0
+        for g in (0, 2):
+            for _ in range(30):
+                c_i = rng.standard_normal()
+                for t in (1, 2, 3, 4):
+                    if g == 0 and t == 4:
+                        continue  # never-treated not observed at t=4
+                    treated = g > 0 and t >= g
+                    y = c_i + 0.5 * t + (1.0 if treated else 0.0) + 0.1 * rng.standard_normal()
+                    rows.append({"unit": uid, "time": t, "first_treat": g, "outcome": y})
+                uid += 1
+        panel = pd.DataFrame(rows)
+
+        with warnings.catch_warnings(record=True) as caught:
+            warnings.simplefilter("always")
+            res = TwoStageDiD(rank_deficient_action="silent").fit(
+                panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+            )
+        assert any(
+            "lstsq" in str(w.message) for w in caught
+        ), "expected the dense-lstsq fallback warning under a singular Omega_0"
+        assert np.isfinite(res.overall_se)
+
+    def test_bootstrap_scores_use_exact_residuals(self) -> None:
+        """The multiplier bootstrap builds its per-cluster GMM scores from the SAME
+        exact Stage-1 / Stage-2 residuals as the analytical variance (the shared
+        `_exact_gmm_residuals` helper), NOT the ~1e-7 iterative residualized
+        outcome. Since bootstrap SEs override the analytical SE when
+        `n_bootstrap > 0`, a residual mismatch would silently report the pre-fix
+        ~1% approximate SE. White-box: the bootstrap influence variance
+        `bread @ (S'S) @ bread` must equal the analytical `overall_se**2` to
+        machine precision on an unbalanced untreated panel (where iterative and
+        exact residuals differ)."""
+        rng = np.random.default_rng(_BASE_SEED_VAR + 11)
+        panel = _make_staggered_panel(
+            rng, cohorts=[2, 4], n_per_cohort=50, n_periods=6, tau_constant=1.0, sigma=0.3
+        )
+        est = TwoStageDiD()
+        res = est.fit(panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat")
+
+        # Reconstruct the Stage-1 internals the bootstrap score-builder consumes.
+        df = panel.copy()
+        omega_0_mask = ~((df["first_treat"] > 0) & (df["time"] >= df["first_treat"]))
+        unit_fe, time_fe, gm, delta_hat, kcm = est._fit_untreated_model(
+            df, "outcome", "unit", "time", None, omega_0_mask
+        )
+        df["_y_tilde"] = est._residualize(
+            df, "outcome", "unit", "time", None, unit_fe, time_fe, gm, delta_hat
+        )
+        treated = ((df["first_treat"] > 0) & (df["time"] >= df["first_treat"])).values
+        X_2 = treated.astype(float).reshape(-1, 1)
+        S, bread, _ = est._compute_cluster_S_scores(
+            df,
+            "unit",
+            "time",
+            None,
+            omega_0_mask,
+            unit_fe,
+            time_fe,
+            delta_hat,
+            kcm,
+            X_2,
+            df["unit"].values,
+            None,
+        )
+        v_boot = bread @ (S.T @ S) @ bread
+        # Identical to the analytical exact SE (would be ~1% off with iterative residuals).
+        assert np.sqrt(v_boot[0, 0]) == pytest.approx(res.overall_se, abs=1e-9)
+
+    def test_bootstrap_event_study_scores_use_exact_residuals(self) -> None:
+        """Multi-column (event-study) bootstrap GMM scores also use the shared
+        exact-residual helper (CI-review D1). White-box: for a multi-horizon
+        Stage-2 design, the bootstrap influence variance `bread @ (S'S) @ bread`
+        equals the analytical `_compute_gmm_variance` vcov **elementwise** to
+        machine precision — both are single-sourced through `_exact_gmm_residuals`,
+        so the k>1 path cannot silently diverge onto the iterative residuals."""
+        rng = np.random.default_rng(_BASE_SEED_VAR + 12)
+        panel = _make_staggered_panel(
+            rng, cohorts=[2, 4], n_per_cohort=50, n_periods=6, tau_constant=1.0, sigma=0.3
+        )
+        est = TwoStageDiD()
+        est.fit(panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat")
+
+        df = panel.copy()
+        omega_0_mask = ~((df["first_treat"] > 0) & (df["time"] >= df["first_treat"]))
+        unit_fe, time_fe, gm, delta_hat, kcm = est._fit_untreated_model(
+            df, "outcome", "unit", "time", None, omega_0_mask
+        )
+        df["_y_tilde"] = est._residualize(
+            df, "outcome", "unit", "time", None, unit_fe, time_fe, gm, delta_hat
+        )
+        # Multi-column Stage-2 design: indicators for event-time horizons 0 and 1.
+        treated = (df["first_treat"] > 0) & (df["time"] >= df["first_treat"])
+        horizon = (df["time"] - df["first_treat"]).where(treated, other=-1).to_numpy()
+        X_2 = np.column_stack([(horizon == 0).astype(float), (horizon == 1).astype(float)])
+        cluster_ids = df["unit"].to_numpy()
+
+        v_analytical = est._compute_gmm_variance(
+            df,
+            "unit",
+            "time",
+            None,
+            omega_0_mask,
+            unit_fe,
+            time_fe,
+            delta_hat,
+            kcm,
+            X_2,
+            cluster_ids,
+        )
+        S, bread, _ = est._compute_cluster_S_scores(
+            df,
+            "unit",
+            "time",
+            None,
+            omega_0_mask,
+            unit_fe,
+            time_fe,
+            delta_hat,
+            kcm,
+            X_2,
+            cluster_ids,
+            None,
+        )
+        v_boot = bread @ (S.T @ S) @ bread
+        np.testing.assert_allclose(v_boot, v_analytical, atol=1e-9)
+
+
+# Pin value for test_no_fsa_global_inverse_se_pin, produced by the global-inverse
+# GMM sandwich code path (see the test docstring). Deterministic given the
+# fixed-seed design (the SE computation itself has no randomness).
+_SE_PIN = 0.03679572008170665
+
+
+# =============================================================================
+# Identification — always-treated, Proposition 5, edge cases
+# =============================================================================
+
+
+class TestGardner2022Identification:
+    """fn. 19 (always-treated exclusion) + Proposition 5 of Borusyak et al.
+    (2024) (no never-treated -> horizons h >= h_bar unidentified) + REGISTRY
+    edge cases (balance_e, zero-obs cohorts)."""
+
+    def test_always_treated_excluded_with_warning(self) -> None:
+        """Units treated in the first observed period have no untreated obs for
+        Stage 1; they are excluded with a warning listing the IDs, and their
+        treated obs do not enter Stage 2."""
+        rng = np.random.default_rng(_BASE_SEED_IDENT + 1)
+        panel = _make_staggered_panel(
+            rng, cohorts=[3, 4], n_per_cohort=50, n_periods=6, tau_constant=1.0, sigma=0.1
+        )
+        n_treated_orig = (
+            TwoStageDiD()
+            .fit(panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat")
+            .n_treated_units
+        )
+
+        # Force 5 units from a treated cohort to onset at period 1 (= min time),
+        # making them always-treated (no untreated obs for Stage 1).
+        cohort3_units = panel.loc[panel["first_treat"] == 3, "unit"].unique()[:5]
+        panel.loc[panel["unit"].isin(cohort3_units), "first_treat"] = 1
+
+        with pytest.warns(UserWarning, match="treated in all observed periods"):
+            res = TwoStageDiD().fit(
+                panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+            )
+        # The 5 always-treated units are dropped, so the treated-unit count falls
+        # by exactly 5, and their obs do not enter Stage 2.
+        assert res.n_treated_units == n_treated_orig - 5
+        assert np.isfinite(res.overall_att)
+
+    def test_no_never_treated_horizons_nan(self) -> None:
+        """Proposition 5 (Borusyak et al. 2024): with no never-treated units and
+        h_bar = max(groups) - min(groups), horizons h >= h_bar are not identified
+        -> NaN effect with n_obs > 0 and a warning. Regression-pins the behavior
+        implemented at two_stage.py:2531-2674 (mirror of ImputationDiD)."""
+        rng = np.random.default_rng(_BASE_SEED_IDENT + 2)
+        # Cohorts 3 and 5, NO never-treated => h_bar = 5 - 3 = 2.
+        panel = _make_staggered_panel(
+            rng,
+            cohorts=[3, 5],
+            n_per_cohort=80,
+            n_periods=8,
+            tau_constant=1.0,
+            sigma=0.1,
+            include_never_treated=False,
+        )
+        with pytest.warns(UserWarning, match="identified"):
+            res = TwoStageDiD().fit(
+                panel,
+                outcome="outcome",
+                unit="unit",
+                time="time",
+                first_treat="first_treat",
+                aggregate="event_study",
+            )
+        assert res.event_study_effects is not None
+        h_bar = 2
+        flagged = [
+            h
+            for h, eff in res.event_study_effects.items()
+            if h >= h_bar and np.isnan(eff["effect"])
+        ]
+        assert flagged, "expected NaN horizons at or above h_bar"
+        for h in flagged:
+            assert res.event_study_effects[h]["n_obs"] > 0, f"h={h} should keep n_obs>0"
+
+    def test_balance_e_no_qualifying_cohorts_warns(self) -> None:
+        """balance_e larger than any cohort's available pre-window -> no cohort
+        qualifies -> warning + event study contains only the reference period."""
+        rng = np.random.default_rng(_BASE_SEED_IDENT + 3)
+        panel = _make_staggered_panel(
+            rng, cohorts=[2, 3], n_per_cohort=60, n_periods=5, tau_constant=1.0, sigma=0.1
+        )
+        with pytest.warns(UserWarning, match="balance_e"):
+            res = TwoStageDiD().fit(
+                panel,
+                outcome="outcome",
+                unit="unit",
+                time="time",
+                first_treat="first_treat",
+                aggregate="event_study",
+                balance_e=10,  # impossible pre-window
+            )
+        assert res.event_study_effects is not None
+        # Only the reference period survives.
+        finite_effects = [
+            h for h, eff in res.event_study_effects.items() if eff.get("n_obs", 0) > 0
+        ]
+        assert finite_effects == [] or all(
+            res.event_study_effects[h]["effect"] == 0.0 for h in res.event_study_effects
+        )
+
+    def test_zero_obs_cohort_nan(self) -> None:
+        """A cohort whose treated obs ALL fall at an unidentified period -> its
+        group effect is NaN with n_obs=0 (REGISTRY edge case). Constructed: period
+        4 is observed only among treated obs (never-treated absent at t=4), so its
+        time FE is unidentified in Ω₀ and every t=4 outcome has NaN y_tilde. Cohort
+        4 is treated only at t=4 -> all its treated obs are NaN -> n_obs=0; cohort 3
+        keeps its identified t=3 effect (n_obs > 0)."""
+        rng = np.random.default_rng(_BASE_SEED_IDENT + 4)
+        rows: List[Dict[str, Any]] = []
+        uid = 0
+        # Never-treated: observed t=1,2,3 only (absent at t=4 -> t=4 has no
+        # untreated obs anywhere -> its time FE is unidentified).
+        for _ in range(40):
+            c_i = rng.standard_normal()
+            for t in (1, 2, 3):
+                rows.append(
+                    {
+                        "unit": uid,
+                        "time": t,
+                        "first_treat": 0,
+                        "outcome": c_i + 0.5 * t + 0.1 * rng.standard_normal(),
+                    }
+                )
+            uid += 1
+        # Cohort 3 (treated t>=3) and cohort 4 (treated only at t=4), t=1..4.
+        for g in (3, 4):
+            for _ in range(40):
+                c_i = rng.standard_normal()
+                for t in (1, 2, 3, 4):
+                    treated = t >= g
+                    y = c_i + 0.5 * t + (1.0 if treated else 0.0) + 0.1 * rng.standard_normal()
+                    rows.append({"unit": uid, "time": t, "first_treat": g, "outcome": y})
+                uid += 1
+        panel = pd.DataFrame(rows)
+
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")  # rank / NaN-y_tilde / lstsq warnings expected
+            res = TwoStageDiD(rank_deficient_action="silent").fit(
+                panel,
+                outcome="outcome",
+                unit="unit",
+                time="time",
+                first_treat="first_treat",
+                aggregate="group",
+            )
+        assert res.group_effects is not None
+        # Cohort 4's only treated obs is the unidentified t=4 -> NaN, n_obs=0.
+        assert 4 in res.group_effects
+        c4 = res.group_effects[4]
+        assert c4["n_obs"] == 0, f"cohort 4 should have n_obs=0, got {c4['n_obs']}"
+        assert np.isnan(c4["effect"]), "cohort 4 effect must be NaN"
+        assert np.isnan(c4["se"]) and np.isnan(c4["t_stat"]) and np.isnan(c4["p_value"])
+        # Cohort 3 keeps an identified t=3 effect.
+        assert 3 in res.group_effects and res.group_effects[3]["n_obs"] > 0
+
+
+# =============================================================================
+# Library extensions / deviations
+# =============================================================================
+
+
+class TestGardner2022LibraryDeviations:
+    """Gardner (2022) prescribes analytical GMM SEs only (§3.3 + Appendix B).
+    The library adds a multiplier bootstrap and narrows vcov_type to the GMM
+    sandwich -- both documented in REGISTRY ## TwoStageDiD."""
+
+    def test_multiplier_bootstrap_is_library_extension(self, ci_params) -> None:
+        """The multiplier bootstrap on the GMM influence function runs and yields
+        a finite bootstrap SE; it is a library extension (the paper proposes no
+        bootstrap; R did2s defaults to analytical SEs)."""
+        rng = np.random.default_rng(_BASE_SEED_DEV + 1)
+        panel = _make_staggered_panel(
+            rng, cohorts=[3, 4], n_per_cohort=60, tau_constant=1.0, sigma=0.2
+        )
+        n_boot = ci_params.bootstrap(99)
+        res = TwoStageDiD(n_bootstrap=n_boot, seed=123).fit(
+            panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+        assert res.bootstrap_results is not None
+        assert np.isfinite(res.bootstrap_results.overall_att_se)
+        assert res.bootstrap_results.overall_att_se > 0
+
+    def test_vcov_type_narrow_to_hc1(self) -> None:
+        """vcov_type is permanently narrow to the GMM sandwich: classical/hc2/
+        hc2_bm are rejected at construction (the GMM-corrected meat has no single
+        hat matrix spanning both stages)."""
+        for bad in ("classical", "hc2", "hc2_bm"):
+            with pytest.raises(ValueError, match="rejected"):
+                TwoStageDiD(vcov_type=bad)
+        # conley is deferred (separate error), and an unknown type is invalid.
+        with pytest.raises(ValueError):
+            TwoStageDiD(vcov_type="conley")
+        with pytest.raises(ValueError):
+            TwoStageDiD(vcov_type="bogus")
+
+
+# =============================================================================
+# R parity — did2s::did2s() (skip-guarded golden)
+# =============================================================================
+
+
+@pytest.fixture(scope="module")
+def golden() -> dict:
+    if not _R_FIXTURE_AVAILABLE:
+        pytest.skip(
+            "R did2s parity fixture not present. Run "
+            "`Rscript benchmarks/R/generate_did2s_golden.R` to regenerate "
+            "`benchmarks/data/did2s_golden.json`."
+        )
+    with GOLDEN_PATH.open("r") as f:
+        return json.loads(f.read())
+
+
+@pytest.fixture(scope="module")
+def panel() -> pd.DataFrame:
+    if not _R_FIXTURE_AVAILABLE:
+        pytest.skip("R did2s parity fixture not present.")
+    return pd.read_csv(PANEL_PATH)
+
+
+class TestTwoStageDiDParityR:
+    """Pin Python TwoStageDiD output against R ``did2s::did2s()`` (analytical
+    corrected clustered SEs, bootstrap = FALSE, clustered at unit)."""
+
+    def test_overall_att_matches_r(self, golden: dict, panel: pd.DataFrame) -> None:
+        res = TwoStageDiD().fit(
+            panel, outcome="y", unit="unit", time="time", first_treat="first_treat"
+        )
+        assert res.overall_att == pytest.approx(golden["overall"]["att"], abs=1e-6)
+
+    def test_overall_se_matches_r(self, golden: dict, panel: pd.DataFrame) -> None:
+        res = TwoStageDiD().fit(
+            panel, outcome="y", unit="unit", time="time", first_treat="first_treat"
+        )
+        assert res.overall_se == pytest.approx(golden["overall"]["se"], abs=1e-7)
+
+    def test_event_study_atts_match_r(self, golden: dict, panel: pd.DataFrame) -> None:
+        res = TwoStageDiD().fit(
+            panel,
+            outcome="y",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            aggregate="event_study",
+        )
+        assert res.event_study_effects is not None
+        es = golden["event_study"]
+        assert len(es["horizons"]) > 0
+        for h, att in zip(es["horizons"], es["att"]):
+            assert h in res.event_study_effects, f"missing horizon {h}"
+            got = res.event_study_effects[h]["effect"]
+            assert np.isfinite(got), f"non-finite ATT at h={h}"
+            assert got == pytest.approx(att, abs=1e-6), f"h={h}"
+
+    def test_event_study_ses_match_r(self, golden: dict, panel: pd.DataFrame) -> None:
+        res = TwoStageDiD().fit(
+            panel,
+            outcome="y",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+            aggregate="event_study",
+        )
+        assert res.event_study_effects is not None
+        es = golden["event_study"]
+        assert len(es["horizons"]) > 0
+        for h, se in zip(es["horizons"], es["se"]):
+            assert h in res.event_study_effects, f"missing horizon {h}"
+            got = res.event_study_effects[h]["se"]
+            assert np.isfinite(got), f"non-finite SE at h={h}"
+            assert got == pytest.approx(se, abs=1e-7), f"h={h}"