diff --git a/CHANGELOG.md b/CHANGELOG.md index 4644bab5..dd4a67df 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,8 +9,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added - **`generate_synthetic_control_data()` data generator + a capstone `SyntheticControl` tutorial.** New public generator (`diff_diff/prep_dgp.py`, exported from `diff_diff`) builds a **single-treated-unit** factor-model panel for synthetic-control demos and tests: one treated unit whose latent factor loadings and baseline are an exact convex combination of a few donors (so the noiseless trajectory lies in the donor convex hull and a synthetic control reproduces it closely — the observed fit is approximate under added noise), persistent AR(1) factors, predictor covariates that each proxy a distinct factor, a common calendar time effect, and a known `"ramp"` or `"constant"` treatment effect emitted as `true_effect`. Tutorial **`docs/tutorials/25_synthetic_control_policy.ipynb`** walks the whole `SyntheticControl` surface end-to-end on a policy-evaluation story (one state adopts a clean-energy standard), structured around **two inference philosophies**: cross-unit permutation (`in_space_placebo` + Firpo–Possebom `confidence_set`, with `leave_one_out` / `in_time_placebo` robustness) versus over-time conformal (CWZ `conformal_test` / `conformal_confidence_intervals` / `conformal_average_effect`), with the per-period conformal band as the climax. A `tests/test_t25_synthetic_control_policy_drift.py` drift guard re-derives every quoted number from the generator. +- **`TwoStageDiD` methodology validation (Gardner 2022 / `did2s`).** New `tests/test_methodology_two_stage.py` with paper-equation-numbered Verified Components (§3 two-stage procedure / eqs. 4 & 6, §3.3 GMM variance, fn. 19 always-treated exclusion, Proposition 5, covariate path, `balance_e`, `vcov_type` narrowing) plus a `did2s::did2s()` cross-language parity fixture (`benchmarks/R/generate_did2s_golden.R` → `benchmarks/data/did2s_golden.json` + `did2s_test_panel.csv`), pinning overall + event-study ATT (abs 1e-6) and SE (abs 1e-7). `METHODOLOGY_REVIEW.md` `TwoStageDiD` row flipped to **Complete**. ### Fixed +- **`TwoStageDiD` analytical GMM standard errors are now exact (match R `did2s` to ~1e-7).** The Gardner two-stage GMM sandwich `_compute_gmm_variance` derived its residuals from the *iterative* alternating-projection first-stage fixed effects (`_iterative_fe`, which converge only to ~1e-7 on unbalanced untreated panels) while computing `gamma_hat` exactly — leaving the variance ~1% off the analytical sandwich. The variance now re-solves the Stage-1 FE **exactly** (sparse OLS, reusing the `gamma_hat` factorization), and `_build_fe_design` gained an intercept column so its column space spans the grand mean (the prior intercept-free design omitted it, and the exact residual is first-order sensitive to it). Unidentified-FE obs (rank-deficient / Proposition-5) fall back to the iterative residual, so those edge cases are unchanged; the reported `overall_att` still uses the iterative FE (point-estimate equivalence with `ImputationDiD` preserved). Mirrors the same-class fix already applied to `ImputationDiD`'s exact-sparse variance. - **`LinearRegression.get_se()` / `get_inference()` no longer return a `NaN` standard error from a tiny-negative variance artifact.** A high-leverage / degenerate coefficient (e.g. an absorbed-FE dummy near-collinear with the treatment, whose Bell-McCaffrey Satterthwaite DOF already hits the noise-floor guard) can have a CR2/HC variance of ~0 (≈1e-32) whose vcov diagonal lands just-below-zero under BLAS-dependent float rounding; `np.sqrt` of the negative then produced a `NaN` SE **nondeterministically** — passing single-threaded but failing under the parallel pure-Python full-suite run (`tests/test_methodology_wls_cr2.py::TestLinearRegressionFENanGuardEndToEnd::test_did_absorbed_fe_lr_inference_nan_for_guarded_coefs`). Both SE sites now clamp the vcov diagonal at 0, so the SE is finite (0 for a genuinely-zero variance), deterministic, and BLAS-independent. **No change for any positive variance** (the clamp is a no-op there); only the previously-`NaN` degenerate case is affected. ## [3.5.2] - 2026-06-08 diff --git a/METHODOLOGY_REVIEW.md b/METHODOLOGY_REVIEW.md index 9f9611b8..74ce9f7c 100644 --- a/METHODOLOGY_REVIEW.md +++ b/METHODOLOGY_REVIEW.md @@ -48,7 +48,7 @@ The catalog grew incrementally over several quarters, so formats vary across the | SunAbraham | `sun_abraham.py` | `fixest::sunab()` | **Complete** | 2026-02-15 | | StackedDiD | `stacked_did.py` | `stacked-did-weights` (Wing-Freedman-Hollingsworth code) | **Complete** | 2026-02-19 | | ImputationDiD | `imputation.py` | `didimputation` | **Complete** | 2026-06-06 | -| TwoStageDiD | `two_stage.py` | `did2s` | **In Progress** | — | +| TwoStageDiD | `two_stage.py` | `did2s` | **Complete** | 2026-06-23 | | WooldridgeDiD (ETWFE) | `wooldridge.py` | `etwfe` (R) / `jwdid` (Stata) | **Complete** | 2026-05-22 | | EfficientDiD | `efficient_did.py` | (no canonical R package) | **Complete** | 2026-06-01 | @@ -583,19 +583,36 @@ and covariate-adjusted specifications.) | Module | `two_stage.py`, `two_stage_bootstrap.py` | | Primary Reference | Gardner (2022), *Two-stage differences in differences*, arXiv:2207.05943 | | R Reference | `did2s` | -| Status | **In Progress** | -| Last Review | — | +| Status | **Complete** | +| Last Review | 2026-06-23 | -**Documentation in place:** -- REGISTRY.md section: `## TwoStageDiD` (Stage 1 unit+time FE on untreated, Stage 2 OLS on residualized outcomes, GMM sandwich variance per Newey-McFadden Theorem 6.1) -- Paper review: `docs/methodology/papers/gardner-2022-review.md` (PR-A — eq./section-numbered review of arXiv:2207.05943; corrected a fabricated Eq. 6 variance deviation, see "Documented alignment" below) -- Implementation: 76 unit tests in `tests/test_two_stage.py` (matches ImputationDiD point estimates, R `did2s` global `(D'D)^{-1}` variance, always-treated unit exclusion, multiplier bootstrap) -- Documented alignment: variance = global `(D'D)^{-1}` GMM sandwich (Newey-McFadden Theorem 6.1, Gardner §3.3) — **faithful to both the paper and `did2s`**. Gardner eq. (6) is the *event-study regression spec*, not a variance formula; the earlier "matches `did2s`, not paper Eq. 6" / "Newey-McFadden sandwich vs paper's Eq. 6 deviation" framing was a misattribution, corrected in PR-A across `REGISTRY.md` + the paper review. +**Verified Components** (`tests/test_methodology_two_stage.py`, paper-section/equation-numbered): +- **§3, eqs. (4)/(6) — the two-stage procedure:** Stage 1 unit+time FE on the untreated set Ω₀, Stage 2 on the residualized outcome recovers the constant overall ATT (eq. 4) and heterogeneous-by-horizon effects (Step 2′ / eq. 6); perturbing a treated outcome shifts the overall ATT by exactly δ/N_treated (treated obs never feed Stage 1); coincides with `ImputationDiD` to 1e-10; the live `covariates=` (fn. 9 in-both-stages) path recovers the ATT under a treatment-confounded covariate — `TestGardner2022Section3TwoStageProcedure` +- **§3.3 + Appendix B — joint-GMM Newey-McFadden Thm 6.1 variance:** finite/positive SEs; the first-stage correction `γ̂'c_g` makes the GMM SE strictly exceed the no-first-stage floor; a fixed-seed SE regression-pin locks the global-inverse + no-FSA convention; `cluster=None` clusters at the unit; a rank-deficient Ω₀ warns and falls back to dense lstsq; `pretrends=True` leads have finite SEs — `TestGardner2022Section33GMMVariance` +- **fn. 19 + Proposition 5 (Borusyak et al. 2024):** always-treated units dropped with a warning (treated-unit count falls exactly); no never-treated ⇒ horizons h ≥ h̄ NaN with n_obs>0 + warning; `balance_e` with no qualifying cohort warns + reference-period-only; zero-obs cohorts NaN — `TestGardner2022Identification` +- **Library extensions:** multiplier bootstrap on the GMM influence function; `vcov_type ∈ {classical, hc2, hc2_bm}` rejected (no cross-stage hat matrix) — `TestGardner2022LibraryDeviations` +- **R parity vs `did2s::did2s` (v1.2.1):** overall + event-study ATT and SEs match on a fixed-seed staggered panel (analytical corrected clustered SE, `bootstrap=FALSE`); tests assert ATT `abs=1e-6`, SE `abs=1e-7` — `TestTwoStageDiDParityR` (goldens: `benchmarks/data/did2s_golden.json`, generator: `benchmarks/R/generate_did2s_golden.R`) -**Outstanding for promotion:** -- Dedicated `tests/test_methodology_two_stage.py` with paper-equation-numbered Verified Components walk-through -- R parity benchmark fixture against `did2s` (none on file) -- "Corrections Made" listing + flip Status → Complete (PR-B) +**Corrections Made** (PR-B, surfaced by the `did2s` R parity): +- **Exact Stage-1 residuals in the GMM variance.** `_compute_gmm_variance` derived its residuals from the *iterative* alternating-projection FE (`_iterative_fe`, ~1e-7 convergence on unbalanced Ω₀) while computing `gamma_hat` exactly — leaving the analytical SE ~1% off the exact sandwich. The variance now re-solves the Stage-1 FE **exactly** (sparse OLS, reusing the `gamma_hat` factorization); the reported `overall_att` still uses the iterative FE (twin-equivalence preserved at 1e-10). Unidentified-FE obs (rank-deficient / Prop-5) fall back to the iterative residual. +- **Intercept added to `_build_fe_design`.** The prior `[unit_1.., time_1..]` layout (drop first unit + first time, **no intercept**) did not span the constant / grand mean; the exact residual is first-order sensitive to it. Added an intercept column → standard full-rank two-way FE (matches fixest / `did2s`). Same-class fix as ImputationDiD's PR-B FE-design correction. SE now matches `did2s` to ~1e-9. + +**Deviations from the reference / library extensions** (see `REGISTRY.md` `## TwoStageDiD`): +- **Deviation from R:** the `did2s` analytical GMM sandwich uses **no finite-sample multiplier** (meat `= S'S`); the rendered `CR1` label carries no Stata `(n-1)/(n-p)` or `G/(G-1)` factor (matches `did2s`; same FSA-free convention as ImputationDiD's Theorem-3 variance). +- Multiplier bootstrap on the GMM influence function (library extension; Gardner prescribes analytical GMM SEs only; `did2s` defaults `bootstrap=FALSE`). +- ⚠️ Paper-permitted but **not exposed**: the Eq. (5) P̄-period-average estimand and the fn. 8 full-sample first-stage variant (tracked in `TODO.md`). +- `vcov_type` narrowed to the GMM sandwich (`{classical, hc2, hc2_bm}` rejected); `vcov_type="conley"` deferred (TODO). + +**R Comparison Results** (`did2s` v1.2.1, fixed-seed panel, `benchmarks/data/did2s_golden.json`): + +| Quantity | Python | R | |Δ| | +|----------|--------|---|-----| +| Overall ATT | 2.04566790 | 2.04566803 | 1.3e-7 | +| Overall SE | 0.02813225 | 0.02813225 | 1.0e-9 | +| Event-study ATT (max) | — | — | 1.6e-7 | +| Event-study SE (max) | — | — | 3.7e-10 | + +(Point-estimate Δ is iterative-FE convergence level; SE Δ is machine precision after the exact Stage-1 re-solve. The parity tests assert ATT `abs=1e-6`, SE `abs=1e-7` for cross-platform robustness.) --- @@ -1447,7 +1464,6 @@ Promotion priority for the **In Progress** entries, ordered by what's blocked on **Substantive-review-blocked (each still missing one or more of: a methodology test file, R parity, or a paper review):** 1. **PlaceboTests** — decide first whether to keep standalone or absorb into per-estimator diagnostic sections; methodologically lightweight either way. -2. **TwoStageDiD** — the remaining half of the imputation pair (ImputationDiD is now Complete, validated against `didimputation`). Gardner (2022) paper review **landed** (`docs/methodology/papers/gardner-2022-review.md`, PR-A); still needs `tests/test_methodology_two_stage.py` and an R parity fixture against `did2s` to flip to Complete (PR-B). **Consolidation-pass-blocked (already has paper review or methodology file or R parity; mostly Verified Components walk-through):** diff --git a/TODO.md b/TODO.md index 045f6fe1..aaa35ee4 100644 --- a/TODO.md +++ b/TODO.md @@ -153,6 +153,7 @@ Deferred items from PR reviews that were not addressed before merge. | `SpilloverDiD` data-driven `d_bar` selection (Butts 2021b / Butts 2023 JUE Insight cross-validation). | `spillover.py::SpilloverDiD` | follow-up | Low | | `SpilloverDiD` sparse cKDTree path for the staggered nearest-treated-distance helper (mirrors the static helper's sparse branch). Currently `_compute_nearest_treated_distance_staggered` always builds dense `(n_units, n_treated_by_onset)` pairwise distance matrices per cohort; on large staggered panels with many cohorts this is avoidable memory/runtime. Add a sparse k-d-tree branch analogous to `_compute_nearest_treated_distance_sparse`, gated on `n > _CONLEY_SPARSE_N_THRESHOLD`. | `spillover.py::_compute_nearest_treated_distance_staggered` | follow-up (Wave B) | Low | | `SpilloverDiDResults` in `DiagnosticReport` dispatch tables. Wave C event-study emits a TwoStageDiD-compatible `event_study_effects: Dict[int, Dict]` alias that `plot_event_study` consumes via the new `reference_period` attribute fallback in `_extract_plot_data`, but `SpilloverDiDResults` is NOT registered in `DiagnosticReport`'s `_APPLICABILITY` / `_PT_METHOD` tables — so `DiagnosticReport(spillover_result)` doesn't currently route to event-study diagnostics. Registering requires (a) deciding which diagnostics apply (parallel trends, pre-trends power, heterogeneity, design-effect) AND (b) adding an end-to-end test. | `diff_diff/diagnostic_report.py::_APPLICABILITY`, `_PT_METHOD` | follow-up (Wave C) | Low | +| `TwoStageDiD` paper-permitted estimand variants not exposed (Gardner 2022): the **Eq. (5) P̄-period-average** estimand (duration-restricted Stage-2 sample) and the **fn. 8 full-sample first-stage** variant (treatment-status×period interactions in Stage 1). Both are valid modifications described in the paper but have no public parameter (`get_params()` exposes neither; Stage 1 is always untreated-only). Documented as ⚠️ in `docs/methodology/papers/gardner-2022-review.md`; surface as `estimand=`/`first_stage=` options if a use case arises. | `diff_diff/two_stage.py` | two-stage-validation follow-up | Low | #### Performance @@ -173,7 +174,6 @@ Deferred items from PR reviews that were not addressed before merge. |-------|----------|----|----------| | Drift test for tutorial 24 qualitative power claims (monotonic dilution fast→slow; CS-vs-2×2 MDE crossover/near-parity at slow rollout) — pins the prose against estimator-default/simulation drift | `docs/tutorials/24_staggered_vs_collapsed_power.ipynb` | staggered-analysis-2x2 | Low | | ImputationDiD covariate-path variance lacks dedicated R `didimputation` parity / hand-calc. The PR-B FE-design correction (keep all unit dummies) affects the covariate projection too, but only the no-covariate staggered panel is R-parity'd (the covariate path shares the same validated projection code and passes the full suite). Add a covariate (time-varying X) R golden asserting overall/event-study SE parity, or a small dense-design hand-calc for the covariate projection. | `tests/test_methodology_imputation.py`, `benchmarks/R/generate_didimputation_golden.R` | imputation-validation follow-up | Low | -| TwoStageDiD methodology validation PR-B: add `tests/test_methodology_two_stage.py` (eq./section-numbered Verified Components — Stage-1 FE recovery on untreated obs; Stage-2 overall ATT eq. 4 + event-study eq. 6; GMM first-stage-correction behavior; always-treated drop) + `did2s` R parity fixture (`benchmarks/R/generate_did2s_golden.R` + `benchmarks/data/did2s_golden.json` + `did2s_test_panel.csv`); then flip `METHODOLOGY_REVIEW.md` TwoStageDiD row In Progress → Complete. PR-A (paper review `gardner-2022-review.md`) merged separately. | `tests/test_methodology_two_stage.py`, `benchmarks/`, `METHODOLOGY_REVIEW.md` | two-stage-validation PR-B | Medium | | Port the CI `` extraction into the reviewer-eval harness so `docs/tutorials/*.ipynb` cases (currently guarded out of `verify-corpus`/`run`) can be reviewed with CI-equivalent context | `tools/reviewer-eval/adapters/ci_prompt.py` | local-review | Low | | **Premise corrected — no CI impact (verified 2026-06-07).** The "slow CI" motivation does not hold: no CI workflow installs R (no `setup-r` / `r-lib/actions` / `fixest` / `r-base` install anywhere in `.github/workflows/`), so every R-parity test skips in CI behind a per-file availability gate (`fixest_available` in twfe, `_check_r_contdid()` in continuous_did, `require_r` / `r_available` in `conftest.py`, etc.) — consolidating `Rscript` spawns yields zero CI speedup. The originally-cited file already session-caches its R fits: `test_methodology_twfe.py` exposes `r_twfe_results` / `r_twfe_results_with_covariate` as `scope="session"` fixtures, so each R model runs once per session, not once per test. The only residual is a LOCAL-dev micro-optimization for developers who have R installed: `test_methodology_continuous_did.py` (the `_run_r_contdid` helper plus three standalone inline `Rscript` calls) and `test_methodology_callaway.py` (`_run_r_estimation` called inline in three test methods, plus `_get_r_mpdta_and_results` re-run by the MPDTA R-parity tests) re-spawn `library(...)` per call with no session-level result cache. Applying the twfe session-fixture pattern there would speed local R-parity runs only. Low value; retained as a local-dev note. | `tests/test_methodology_continuous_did.py`, `tests/test_methodology_callaway.py` | #139 | Low | | CS R helpers hard-code `xformla = ~ 1`; no covariate-adjusted R benchmark for IRLS path | `tests/test_methodology_callaway.py` | #202 | Low | diff --git a/benchmarks/R/generate_did2s_golden.R b/benchmarks/R/generate_did2s_golden.R new file mode 100644 index 00000000..ac2267eb --- /dev/null +++ b/benchmarks/R/generate_did2s_golden.R @@ -0,0 +1,138 @@ +#!/usr/bin/env Rscript +# Golden generator: TwoStageDiD vs R `did2s::did2s` (Gardner 2022). +# +# Writes a fixed staggered-adoption panel and the R reference estimates so that +# tests/test_methodology_two_stage.py::TestTwoStageDiDParityR can pin Python +# output against R without requiring R at test time. +# +# Outputs (checked into the repo): +# benchmarks/data/did2s_test_panel.csv (unit, time, first_treat, y) +# benchmarks/data/did2s_golden.json +# +# Usage: +# Rscript benchmarks/R/generate_did2s_golden.R +# +# Notes: +# - did2s defaults to analytical corrected clustered SEs (`bootstrap = FALSE`), +# i.e. the Gardner two-stage GMM sandwich with the GLOBAL Jacobian inverse and +# NO finite-sample multiplier. This is exactly what TwoStageDiD computes (see +# docs/methodology/REGISTRY.md "## TwoStageDiD"); we cluster on `unit` to match +# the library default (cluster=None -> unit). +# - The first stage `~ 0 | unit + time` demeans by unit + time FE (which span the +# constant); the library's GMM variance re-solves the same exact two-way FE. +# - Event study: did2s relative-time is `rel_year = time - first_treat` for +# treated units and `Inf` for never-treated; `i(rel_year, ref = c(-1, Inf))` +# drops the -1 reference period and excludes never-treated. We compare only the +# post-treatment horizons (r >= 0), which map 1:1 to the library's +# event_study_effects keys (h = 0 is the first treated period). The pre-period +# leads (r < 0) are ~0 under parallel trends and are not part of the library's +# default (pretrends=False) event study. +# - Same DGP (seed, cohorts, horizon-heterogeneous effect) as the didimputation +# golden, so the point estimates coincide (the two-stage and imputation +# estimands are algebraically identical); only the SEs differ (GMM sandwich vs +# the imputation Theorem-3 variance). + +suppressMessages({ + library(did2s) + library(jsonlite) + library(data.table) +}) + +set.seed(2024) + +# ---- Deterministic staggered-adoption panel (parallel trends) ---- +# Cohorts 3 and 5 plus a never-treated group (first_treat = 0); 8 periods. +n_per_cohort <- 60L +n_periods <- 8L +cohorts <- c(0L, 3L, 5L) # 0 = never-treated +tau_h <- function(k) 1.0 + 0.5 * k # heterogeneous-by-horizon effect + +rows <- list() +uid <- 0L +for (g in cohorts) { + for (j in seq_len(n_per_cohort)) { + c_i <- rnorm(1) + for (t in seq_len(n_periods)) { + beta_t <- 0.5 * t + u <- 0.2 * rnorm(1) + treated <- (g > 0L) && (t >= g) + eff <- if (treated) tau_h(t - g) else 0.0 + y <- c_i + beta_t + eff + u + rows[[length(rows) + 1L]] <- list( + unit = uid, time = t, first_treat = g, y = y + ) + } + uid <- uid + 1L + } +} +panel <- rbindlist(rows) +panel[, unit := as.integer(unit)] +panel[, time := as.integer(time)] +panel[, first_treat := as.integer(first_treat)] + +panel_path <- file.path("benchmarks", "data", "did2s_test_panel.csv") +dir.create(dirname(panel_path), recursive = TRUE, showWarnings = FALSE) +fwrite(panel, panel_path) +message(sprintf("Wrote panel: %s (%d rows)", panel_path, nrow(panel))) + +# did2s-specific columns (derived from first_treat; not committed in the CSV). +panel[, treat := (first_treat > 0L) & (time >= first_treat)] +panel[, rel_year := ifelse(first_treat > 0L, time - first_treat, Inf)] + +# ---- Overall ATT (static) ---- +overall <- did2s( + panel, + yname = "y", + first_stage = ~ 0 | unit + time, + second_stage = ~ i(treat, ref = FALSE), + treatment = "treat", + cluster_var = "unit" +) +oct <- summary(overall)$coeftable +overall_att <- as.numeric(oct[1, 1]) +overall_se <- as.numeric(oct[1, 2]) +message(sprintf("Overall ATT = %.8f (SE %.8f)", overall_att, overall_se)) + +# ---- Event study (post-treatment horizons r >= 0) ---- +es <- did2s( + panel, + yname = "y", + first_stage = ~ 0 | unit + time, + second_stage = ~ i(rel_year, ref = c(-1, Inf)), + treatment = "treat", + cluster_var = "unit" +) +ect <- summary(es)$coeftable +es_h_all <- as.integer(gsub("rel_year::", "", rownames(ect), fixed = TRUE)) +keep <- !is.na(es_h_all) & es_h_all >= 0 +ord <- order(es_h_all[keep]) +es_h <- es_h_all[keep][ord] +es_att <- as.numeric(ect[keep, 1])[ord] +es_se <- as.numeric(ect[keep, 2])[ord] +for (i in seq_along(es_h)) { + message(sprintf(" h=%d: ATT=%.6f SE=%.6f", es_h[i], es_att[i], es_se[i])) +} + +golden <- list( + estimator = "did2s::did2s", + meta = list( + r_version = R.version.string, + did2s_version = as.character(packageVersion("did2s")), + seed = 2024L, + n_units = length(unique(panel$unit)), + n_periods = n_periods, + cohorts = cohorts, + se_type = "Corrected Clustered (unit); bootstrap = FALSE (analytical GMM sandwich)", + note = paste( + "did2s analytical corrected clustered SE = the Gardner (2022) two-stage GMM", + "sandwich with the global Jacobian inverse and no finite-sample multiplier.", + "Event study compares post-treatment horizons r >= 0 (i(rel_year, ref=c(-1,Inf)))." + ) + ), + overall = list(att = overall_att, se = overall_se), + event_study = list(horizons = es_h, att = es_att, se = es_se) +) + +golden_path <- file.path("benchmarks", "data", "did2s_golden.json") +write_json(golden, golden_path, auto_unbox = TRUE, pretty = TRUE, digits = 12) +message(sprintf("Wrote golden: %s", golden_path)) diff --git a/benchmarks/data/did2s_golden.json b/benchmarks/data/did2s_golden.json new file mode 100644 index 00000000..6fe1f00e --- /dev/null +++ b/benchmarks/data/did2s_golden.json @@ -0,0 +1,22 @@ +{ + "estimator": "did2s::did2s", + "meta": { + "r_version": "R version 4.5.2 (2025-10-31)", + "did2s_version": "1.2.1", + "seed": 2024, + "n_units": 180, + "n_periods": 8, + "cohorts": [0, 3, 5], + "se_type": "Corrected Clustered (unit); bootstrap = FALSE (analytical GMM sandwich)", + "note": "did2s analytical corrected clustered SE = the Gardner (2022) two-stage GMM sandwich with the global Jacobian inverse and no finite-sample multiplier. Event study compares post-treatment horizons r >= 0 (i(rel_year, ref=c(-1,Inf)))." + }, + "overall": { + "att": 2.045668026901, + "se": 0.0281322476332 + }, + "event_study": { + "horizons": [0, 1, 2, 3, 4, 5], + "att": [0.9929366801146, 1.512964957441, 2.006499440277, 2.499854710124, 2.964817090473, 3.46735160262], + "se": [0.0291425705467, 0.02572953717585, 0.03023588469852, 0.02950889195402, 0.04306469246781, 0.04307353806204] + } +} diff --git a/benchmarks/data/did2s_test_panel.csv b/benchmarks/data/did2s_test_panel.csv new file mode 100644 index 00000000..795d1408 --- /dev/null +++ b/benchmarks/data/did2s_test_panel.csv @@ -0,0 +1,1441 @@ +unit,time,first_treat,y +0,1,0,1.57571241935766 +0,2,0,1.96037514490715 +0,3,0,2.4393937775562 +0,4,0,3.21358910235831 +0,5,0,3.74044038043802 +0,6,0,4.0888988405769 +0,7,0,4.45656246750959 +0,8,0,4.73699432845663 +1,1,0,-0.955736306978872 +1,2,0,-0.0275677162035229 +1,3,0,0.545086679135234 +1,4,0,0.937253018862152 +1,5,0,0.723791533328185 +1,6,0,1.84902654760279 +1,7,0,2.19266095078478 +1,8,0,2.96949156815333 +2,1,0,-1.5084015064213 +2,2,0,-0.378111913878043 +2,3,0,-0.042966650307 +2,4,0,0.602429979328702 +2,5,0,0.851296260003455 +2,6,0,1.47331283565702 +2,7,0,1.58516225918639 +2,8,0,2.37025281899957 +3,1,0,0.773553421186285 +3,2,0,1.66281308359629 +3,3,0,1.8967391557127 +3,4,0,2.61963089020398 +3,5,0,3.08364170717106 +3,6,0,3.92554769256714 +3,7,0,4.0849021050487 +3,8,0,4.76261698109082 +4,1,0,0.739479564853692 +4,2,0,1.2231776627253 +4,3,0,1.923004687699 +4,4,0,2.16283950742251 +4,5,0,2.83639367219199 +4,6,0,3.43640690465978 +4,7,0,4.23189561073763 +4,8,0,4.09795945007321 +5,1,0,0.696250409154038 +5,2,0,1.26407014144064 +5,3,0,1.75823707259106 +5,4,0,2.51852874038656 +5,5,0,2.51953066817035 +5,6,0,2.97729626930948 +5,7,0,3.50109960885476 +5,8,0,3.92493231036827 +6,1,0,-0.467620203725488 +6,2,0,0.498200631038206 +6,3,0,0.771289674442228 +6,4,0,1.51918269111337 +6,5,0,1.81258393257196 +6,6,0,2.63769134378166 +6,7,0,2.99114141499478 +6,8,0,3.426775882052 +7,1,0,1.00442536585164 +7,2,0,1.14946814984633 +7,3,0,1.27614494187586 +7,4,0,2.31817361569775 +7,5,0,2.7735856504085 +7,6,0,2.75148321958136 +7,7,0,3.9015094287719 +7,8,0,3.93360268177416 +8,1,0,0.810919656606769 +8,2,0,1.33509796825539 +8,3,0,1.73331413864552 +8,4,0,2.58012394795544 +8,5,0,2.93256002869384 +8,6,0,3.08809466037152 +8,7,0,3.7990024564743 +8,8,0,4.047223982925 +9,1,0,1.66465200951401 +9,2,0,1.64085127184322 +9,3,0,2.52162062529756 +9,4,0,3.21387389366594 +9,5,0,3.45530839871767 +9,6,0,3.94082142799207 +9,7,0,4.44167164418959 +9,8,0,5.15518317344816 +10,1,0,-0.831661359911138 +10,2,0,-0.368357827183266 +10,3,0,0.410072156567392 +10,4,0,0.882103241637359 +10,5,0,0.859414101818633 +10,6,0,1.5394209127834 +10,7,0,2.03673455706977 +10,8,0,2.76663279055195 +11,1,0,2.09325413143569 +11,2,0,2.2395483474461 +11,3,0,2.8601549884189 +11,4,0,3.17665713989311 +11,5,0,3.39847189994413 +11,6,0,4.2263937983117 +11,7,0,4.61159990784862 +11,8,0,5.38247837176413 +12,1,0,-0.289257031210422 +12,2,0,0.179620271573438 +12,3,0,1.16325667887395 +12,4,0,1.29809414474463 +12,5,0,1.93251912824397 +12,6,0,2.31329931268197 +12,7,0,2.57222068621979 +12,8,0,3.81925292367005 +13,1,0,0.976960925165865 +13,2,0,1.71134856408002 +13,3,0,1.9768653993617 +13,4,0,2.46420562075595 +13,5,0,2.76918384264762 +13,6,0,3.49046122766092 +13,7,0,3.88763110383467 +13,8,0,4.53568739049527 +14,1,0,-0.791853272056159 +14,2,0,-0.187296784744338 +14,3,0,0.530416765901547 +14,4,0,0.732470150162659 +14,5,0,1.4532088556853 +14,6,0,1.79656184517501 +14,7,0,2.43385127094151 +14,8,0,3.26466655229001 +15,1,0,-0.583155656781233 +15,2,0,0.262614329354861 +15,3,0,0.72222756208318 +15,4,0,1.09701695155011 +15,5,0,1.76734636682187 +15,6,0,1.90769767564863 +15,7,0,2.49770816748529 +15,8,0,3.50566869748441 +16,1,0,1.51029210211226 +16,2,0,2.21108464745424 +16,3,0,2.39172676176666 +16,4,0,3.03846284438367 +16,5,0,3.57410107529903 +16,6,0,3.63656357697657 +16,7,0,4.35531989841439 +16,8,0,5.34275735958087 +17,1,0,1.59099751867213 +17,2,0,1.65413717936619 +17,3,0,2.81028796905675 +17,4,0,3.18328780802577 +17,5,0,3.60193326786874 +17,6,0,4.30288969049577 +17,7,0,5.09222854841844 +17,8,0,5.24392584530123 +18,1,0,0.807934311011389 +18,2,0,0.919434492380615 +18,3,0,1.36895852651825 +18,4,0,2.02293951095349 +18,5,0,2.15047307163489 +18,6,0,3.24011684506403 +18,7,0,3.93474369140224 +18,8,0,3.92183101868943 +19,1,0,1.32801513099924 +19,2,0,1.96742115703838 +19,3,0,2.50553624597495 +19,4,0,3.1571181882546 +19,5,0,3.67433692462913 +19,6,0,3.95909268282874 +19,7,0,4.56895422521186 +19,8,0,4.85022982238357 +20,1,0,1.24640274552686 +20,2,0,1.82949029027603 +20,3,0,2.06251105183055 +20,4,0,2.6204320278319 +20,5,0,3.34593099675153 +20,6,0,3.89479306001932 +20,7,0,4.12342129630388 +20,8,0,4.56147002880037 +21,1,0,0.860756519278362 +21,2,0,1.2694433258087 +21,3,0,2.08956104088926 +21,4,0,2.69159713038917 +21,5,0,3.23602267039261 +21,6,0,3.59518778140755 +21,7,0,4.23131902599625 +21,8,0,4.57617578349933 +22,1,0,2.43391823145599 +22,2,0,3.29698778720204 +22,3,0,3.89162276187279 +22,4,0,4.14100139568931 +22,5,0,4.83762542995638 +22,6,0,5.50105884329399 +22,7,0,5.5945345328711 +22,8,0,6.14616319307488 +23,1,0,0.820304078069636 +23,2,0,1.23046468721595 +23,3,0,1.50600314831064 +23,4,0,2.25606058438898 +23,5,0,2.53143432820006 +23,6,0,2.59572892973837 +23,7,0,3.77272896159646 +23,8,0,4.22403778880971 +24,1,0,1.30320440910979 +24,2,0,1.57940840743969 +24,3,0,2.41463681558043 +24,4,0,2.56829935718091 +24,5,0,3.49780858741517 +24,6,0,3.72559263134312 +24,7,0,4.18400971215264 +24,8,0,4.70396790016707 +25,1,0,0.722394762356275 +25,2,0,1.3311251714644 +25,3,0,1.68814275470761 +25,4,0,2.36251135873304 +25,5,0,3.00874464026825 +25,6,0,3.37448669083085 +25,7,0,3.69537799006947 +25,8,0,4.16475050532772 +26,1,0,0.485466137435106 +26,2,0,0.955298995135348 +26,3,0,1.3648799323756 +26,4,0,1.98362552002136 +26,5,0,2.28467222598593 +26,6,0,3.01150181904857 +26,7,0,3.68634796548661 +26,8,0,4.24100706573287 +27,1,0,1.45213953627161 +27,2,0,2.51251641366008 +27,3,0,2.52124732803043 +27,4,0,3.18575111662751 +27,5,0,3.40979315749624 +27,6,0,4.16480766882068 +27,7,0,4.9800117454749 +27,8,0,5.16866225517743 +28,1,0,1.47041955448512 +28,2,0,2.08037536287145 +28,3,0,2.21113330902964 +28,4,0,3.05554611460444 +28,5,0,3.49374311745772 +28,6,0,4.1364449581819 +28,7,0,4.29578854745349 +28,8,0,4.97821774925704 +29,1,0,0.262356292405827 +29,2,0,0.505228319934758 +29,3,0,0.648309820648464 +29,4,0,1.77814500276858 +29,5,0,2.06551571971985 +29,6,0,2.80221659444927 +29,7,0,3.26443933955507 +29,8,0,3.84146488548468 +30,1,0,1.35892475350633 +30,2,0,1.62643339657715 +30,3,0,2.23187200983097 +30,4,0,3.00915928829472 +30,5,0,3.40093817799691 +30,6,0,3.77078234223752 +30,7,0,4.07720483163503 +30,8,0,4.70776265031241 +31,1,0,0.514284092131842 +31,2,0,0.929242002262637 +31,3,0,1.65404943247237 +31,4,0,1.92599558635916 +31,5,0,2.50336926100635 +31,6,0,2.80292712881225 +31,7,0,3.22548807917408 +31,8,0,3.89407585007867 +32,1,0,2.1037921374156 +32,2,0,1.9591790558313 +32,3,0,2.75788925585417 +32,4,0,3.42871103233711 +32,5,0,3.51374483193496 +32,6,0,4.16476294003874 +32,7,0,4.73817884229738 +32,8,0,5.35965712639246 +33,1,0,-0.983878865913057 +33,2,0,-0.201733160596227 +33,3,0,0.413200978341906 +33,4,0,0.644485139682032 +33,5,0,1.20737835692523 +33,6,0,1.98446074191797 +33,7,0,2.58941144241347 +33,8,0,2.86083222928688 +34,1,0,0.534940413004529 +34,2,0,0.927812496777354 +34,3,0,1.65556229998545 +34,4,0,1.96977450879971 +34,5,0,2.12481216904322 +34,6,0,2.56992884563598 +34,7,0,3.21788407818519 +34,8,0,4.47795248685649 +35,1,0,0.0704538455930257 +35,2,0,0.586045127464323 +35,3,0,1.28693189219002 +35,4,0,1.45015426885344 +35,5,0,1.86472666254589 +35,6,0,2.6183900614437 +35,7,0,3.1560522061993 +35,8,0,3.44252965913117 +36,1,0,3.12452067281464 +36,2,0,3.45229798099662 +36,3,0,4.29435562137185 +36,4,0,4.26660787900754 +36,5,0,5.0842455742871 +36,6,0,5.67391110796837 +36,7,0,6.0321764957069 +36,8,0,6.9248556835705 +37,1,0,0.248010121268287 +37,2,0,0.218757774433319 +37,3,0,1.29582269643126 +37,4,0,1.10673143478588 +37,5,0,2.0234932808739 +37,6,0,2.61035176377423 +37,7,0,2.89106753234132 +37,8,0,3.20359053825622 +38,1,0,0.834482571899491 +38,2,0,1.4723499908137 +38,3,0,2.1300539482473 +38,4,0,2.09732015937341 +38,5,0,2.92321674683984 +38,6,0,3.09833923540185 +38,7,0,4.08675476446452 +38,8,0,4.24565060260494 +39,1,0,1.44107802593 +39,2,0,1.64489916188785 +39,3,0,1.88851144894041 +39,4,0,2.69471610525367 +39,5,0,3.44232085355739 +39,6,0,3.95954657224985 +39,7,0,4.53142848793589 +39,8,0,4.57597793926637 +40,1,0,0.364647298466558 +40,2,0,0.604784173033909 +40,3,0,1.12422873073565 +40,4,0,1.84301494999195 +40,5,0,2.14882705365019 +40,6,0,2.78891907777397 +40,7,0,3.27795453488485 +40,8,0,3.87275888633489 +41,1,0,0.137967529800507 +41,2,0,0.204201956718598 +41,3,0,0.706650016793307 +41,4,0,1.25929805115561 +41,5,0,1.79895672923494 +41,6,0,2.26316227343556 +41,7,0,2.74737465082776 +41,8,0,3.19656774941229 +42,1,0,2.65694491490471 +42,2,0,3.36021547561185 +42,3,0,3.77649375599502 +42,4,0,4.37832197992893 +42,5,0,4.83162233107423 +42,6,0,5.63097688304612 +42,7,0,5.9035177657234 +42,8,0,6.00244172559591 +43,1,0,1.29459339196085 +43,2,0,2.06229387101955 +43,3,0,2.32393491880218 +43,4,0,2.96976054361446 +43,5,0,3.49981008932945 +43,6,0,3.5922969887643 +43,7,0,4.60474195517341 +43,8,0,4.93145452060186 +44,1,0,0.0537387830451247 +44,2,0,0.315467553812196 +44,3,0,1.04907888135422 +44,4,0,1.98578147714612 +44,5,0,2.243375433765 +44,6,0,2.62367532135965 +44,7,0,2.91980931154629 +44,8,0,3.77728561241702 +45,1,0,-1.51094473126128 +45,2,0,-1.08114119663183 +45,3,0,-0.495719551502474 +45,4,0,0.0807982002853336 +45,5,0,0.719393630593725 +45,6,0,0.955141386401007 +45,7,0,1.60386116012261 +45,8,0,2.08632671800602 +46,1,0,1.25483524625893 +46,2,0,1.52349388053221 +46,3,0,2.07864012836276 +46,4,0,2.42760483592431 +46,5,0,3.36163054876707 +46,6,0,3.39605200020633 +46,7,0,4.55456273987458 +46,8,0,4.71915089881398 +47,1,0,0.237988571102958 +47,2,0,0.508867690357799 +47,3,0,1.35934282325916 +47,4,0,1.29053284350802 +47,5,0,2.2285081029067 +47,6,0,2.85054722723214 +47,7,0,3.48715870521065 +47,8,0,3.85639684798515 +48,1,0,1.29755155236581 +48,2,0,2.17913342070303 +48,3,0,2.6784298272036 +48,4,0,2.6548752237907 +48,5,0,3.30206584127549 +48,6,0,3.86254626518904 +48,7,0,4.47816735226851 +48,8,0,4.85765399592985 +49,1,0,2.01909707704745 +49,2,0,2.47308897703961 +49,3,0,3.17407097381774 +49,4,0,3.48203011921528 +49,5,0,3.81709557266252 +49,6,0,4.45747821086635 +49,7,0,5.09981572791837 +49,8,0,5.4802102218025 +50,1,0,-0.448837895689941 +50,2,0,-0.298643802901852 +50,3,0,-0.080419715770935 +50,4,0,1.02593965853912 +50,5,0,1.56524071083903 +50,6,0,1.96652476356899 +50,7,0,2.52012227980579 +50,8,0,2.63075998602788 +51,1,0,1.86181485827551 +51,2,0,2.35816003559163 +51,3,0,2.83862141095108 +51,4,0,3.46711621495583 +51,5,0,3.41682084283998 +51,6,0,4.22243929403655 +51,7,0,4.75847326488305 +51,8,0,5.49054291782366 +52,1,0,2.23352514687859 +52,2,0,3.00549552536479 +52,3,0,3.30909077188148 +52,4,0,3.78770896749406 +52,5,0,4.34257127024374 +52,6,0,4.80766785701701 +52,7,0,5.34334476016836 +52,8,0,5.83860145420842 +53,1,0,-0.344559747867141 +53,2,0,0.395564116331045 +53,3,0,0.825027952693013 +53,4,0,1.14369923051778 +53,5,0,1.62492672907177 +53,6,0,2.3809402166797 +53,7,0,2.58561529765697 +53,8,0,3.33341961530725 +54,1,0,0.314539815534906 +54,2,0,0.669260561259938 +54,3,0,1.46722345622402 +54,4,0,1.76349563660827 +54,5,0,2.24079497205936 +54,6,0,2.84479884210006 +54,7,0,3.22654888876197 +54,8,0,3.59724632826831 +55,1,0,0.525365760401837 +55,2,0,1.3340038530387 +55,3,0,1.34050524765035 +55,4,0,2.25559773033464 +55,5,0,2.79970276099734 +55,6,0,2.99841464625356 +55,7,0,3.80445972387623 +55,8,0,4.14782592813191 +56,1,0,2.56790198143581 +56,2,0,2.81550010461755 +56,3,0,4.08671392398119 +56,4,0,4.21723021006283 +56,5,0,4.51843018970835 +56,6,0,5.24797068517281 +56,7,0,5.82256153956169 +56,8,0,6.04644022917596 +57,1,0,1.27608262164854 +57,2,0,1.72002062777107 +57,3,0,2.25420838813264 +57,4,0,2.81222438525511 +57,5,0,3.34292214381018 +57,6,0,3.46463499145415 +57,7,0,4.22766560217851 +57,8,0,4.79839845669278 +58,1,0,-0.658053842903073 +58,2,0,0.160768557746596 +58,3,0,1.07139341593188 +58,4,0,1.00688918370808 +58,5,0,1.89184774305466 +58,6,0,2.38059852440773 +58,7,0,2.80089686759589 +58,8,0,3.06344299321046 +59,1,0,-0.70984085235411 +59,2,0,-0.0343084470278902 +59,3,0,0.385499554590131 +59,4,0,1.24525769117722 +59,5,0,1.31020609466661 +59,6,0,1.67159139528118 +59,7,0,1.97309584979105 +59,8,0,2.87245086682389 +60,1,3,-0.719268682234314 +60,2,3,-0.0560185689875659 +60,3,3,1.16254141100696 +60,4,3,2.29921974917942 +60,5,3,3.69311261584687 +60,6,3,4.51382655685811 +60,7,3,5.24160398532174 +60,8,3,6.30226657904429 +61,1,3,0.677801658691706 +61,2,3,1.11477775412072 +61,3,3,2.62091150194693 +61,4,3,3.49229117282332 +61,5,3,4.49100206304702 +61,6,3,5.58189534770296 +61,7,3,6.19386990222223 +61,8,3,7.83930335657842 +62,1,3,0.305561330615861 +62,2,3,0.970588062449194 +62,3,3,2.59487901961403 +62,4,3,3.0754979630223 +62,5,3,4.18157239389626 +62,6,3,4.86817117141176 +62,7,3,6.25367435797187 +62,8,3,7.18548577800311 +63,1,3,1.05824399624178 +63,2,3,1.15297460223181 +63,3,3,2.8240327290642 +63,4,3,3.83303468974252 +63,5,3,5.18592435458602 +63,6,3,5.72911534903804 +63,7,3,7.18861430423524 +63,8,3,8.33039959696296 +64,1,3,-0.101927785431208 +64,2,3,0.660565642440033 +64,3,3,1.64434385128881 +64,4,3,2.68789348574499 +64,5,3,4.09861609518632 +64,6,3,4.56080665651255 +64,7,3,5.75303580684061 +64,8,3,7.09034460439771 +65,1,3,0.701189036899391 +65,2,3,0.979286949554214 +65,3,3,3.03584391610724 +65,4,3,3.97128878578281 +65,5,3,4.67602103277406 +65,6,3,5.42694651882043 +65,7,3,6.40152363172409 +65,8,3,7.7702764531453 +66,1,3,0.160452702464162 +66,2,3,0.656226548790744 +66,3,3,2.52096178512908 +66,4,3,3.32607449371447 +66,5,3,4.16196429261208 +66,6,3,4.70692724161365 +66,7,3,6.48121105900816 +66,8,3,7.34421580480872 +67,1,3,0.673442654651696 +67,2,3,0.962843749444522 +67,3,3,2.60775819037386 +67,4,3,3.21087811843561 +67,5,3,4.61287762004842 +67,6,3,5.53125653664404 +67,7,3,6.38192420044873 +67,8,3,7.86926085559182 +68,1,3,0.166620263373829 +68,2,3,0.443630391490223 +68,3,3,1.6747810326122 +68,4,3,3.02439782857839 +68,5,3,4.30195869883444 +68,6,3,4.857825046864 +68,7,3,5.87638580951173 +68,8,3,7.01097500867526 +69,1,3,-0.641779152999048 +69,2,3,-0.221423278943882 +69,3,3,1.38658267743202 +69,4,3,2.60253269489247 +69,5,3,3.37017948890763 +69,6,3,4.19563369315284 +69,7,3,5.39454581410649 +69,8,3,6.39442601137506 +70,1,3,-1.07865231004769 +70,2,3,-0.715077800569542 +70,3,3,1.05579627565737 +70,4,3,1.96080227310335 +70,5,3,3.36852477828328 +70,6,3,3.90080254850334 +70,7,3,5.26167711925935 +70,8,3,6.15880759001324 +71,1,3,1.75033664102872 +71,2,3,2.30642048048951 +71,3,3,4.04696319603754 +71,4,3,5.05916472988662 +71,5,3,5.94546718107763 +71,6,3,6.6467891244918 +71,7,3,7.71910730541043 +71,8,3,9.30854464461094 +72,1,3,-0.400589123970168 +72,2,3,0.704652518789466 +72,3,3,1.84079466718125 +72,4,3,2.92576769505844 +72,5,3,3.95249416878941 +72,6,3,5.11855961618223 +72,7,3,6.22523006134833 +72,8,3,6.87750866417755 +73,1,3,0.147520541623632 +73,2,3,0.690283853463692 +73,3,3,2.30283363578552 +73,4,3,3.07028262117547 +73,5,3,4.21465755496875 +73,6,3,5.35526948570922 +73,7,3,5.84832192155986 +73,8,3,7.28878335074962 +74,1,3,-0.0478317593398545 +74,2,3,0.268926163269274 +74,3,3,1.70095356686444 +74,4,3,2.88125406809142 +74,5,3,3.75440375750523 +74,6,3,4.72843651831305 +74,7,3,5.62862501563621 +74,8,3,6.69591708668679 +75,1,3,-0.405809940825969 +75,2,3,0.0900305887133675 +75,3,3,1.7184987102796 +75,4,3,2.32097672798748 +75,5,3,3.57509620039659 +75,6,3,4.70212505734094 +75,7,3,5.66087902239169 +75,8,3,7.18631191136391 +76,1,3,1.02167361969083 +76,2,3,1.69794074051215 +76,3,3,3.07623542965749 +76,4,3,3.95508721058957 +76,5,3,4.83444155561634 +76,6,3,6.00071282815628 +76,7,3,7.01811712970264 +76,8,3,8.23370032057519 +77,1,3,1.64576727904779 +77,2,3,1.82928182967484 +77,3,3,3.41017195105545 +77,4,3,4.82781016841801 +77,5,3,5.84019216911828 +77,6,3,6.17451297287111 +77,7,3,7.29827289804374 +77,8,3,8.72409327068498 +78,1,3,0.133857463187289 +78,2,3,0.725497407732272 +78,3,3,2.12147131221335 +78,4,3,3.32458703624904 +78,5,3,4.21752200613813 +78,6,3,5.43826635440585 +78,7,3,6.44931666372289 +78,8,3,7.22951162055743 +79,1,3,0.0664085065342783 +79,2,3,0.612877586450337 +79,3,3,2.36715889678118 +79,4,3,2.96946782437747 +79,5,3,4.00410947355307 +79,6,3,5.40236491055802 +79,7,3,6.14988628888044 +79,8,3,7.0871738675892 +80,1,3,1.13367864223052 +80,2,3,1.7411622309359 +80,3,3,3.35438805520481 +80,4,3,4.17766379974428 +80,5,3,5.55375054425833 +80,6,3,6.52162478755372 +80,7,3,7.57538743044453 +80,8,3,8.53532311496596 +81,1,3,0.164198356324343 +81,2,3,1.08220392452701 +81,3,3,2.10066530441217 +81,4,3,3.39049462694941 +81,5,3,4.0423463256487 +81,6,3,5.29608644567218 +81,7,3,6.5846947763314 +81,8,3,7.45171907268693 +82,1,3,1.95292002267312 +82,2,3,2.66804235065524 +82,3,3,3.93328273263077 +82,4,3,5.36347054378852 +82,5,3,6.06683860839213 +82,6,3,6.94301654777259 +82,7,3,8.21118877718264 +82,8,3,9.15614741092001 +83,1,3,0.371526712127935 +83,2,3,0.581254274680354 +83,3,3,2.1198225582618 +83,4,3,3.29057531333062 +83,5,3,4.54895821309303 +83,6,3,5.15700625028827 +83,7,3,6.38443767309559 +83,8,3,7.08306530026089 +84,1,3,0.434283865463328 +84,2,3,0.662768371723026 +84,3,3,2.87459496588888 +84,4,3,3.20320613575708 +84,5,3,4.37130674088997 +84,6,3,5.03720949577778 +84,7,3,6.3607043867553 +84,8,3,7.01369330906244 +85,1,3,-0.367309155119938 +85,2,3,0.259182324825179 +85,3,3,1.12220553893189 +85,4,3,2.65130106442621 +85,5,3,3.60964708718875 +85,6,3,4.85380686871999 +85,7,3,5.62351296773631 +85,8,3,6.70124838920003 +86,1,3,0.319974969587851 +86,2,3,0.406142972357535 +86,3,3,1.926672712444 +86,4,3,3.24662671689068 +86,5,3,4.28581787269771 +86,6,3,5.17423004530816 +86,7,3,6.4216854231382 +86,8,3,7.11098904575663 +87,1,3,-0.0104903757766827 +87,2,3,0.248276263315795 +87,3,3,1.60551697625337 +87,4,3,2.93540731541587 +87,5,3,4.0463209954738 +87,6,3,4.87959240162977 +87,7,3,6.05741196274632 +87,8,3,6.78595609626293 +88,1,3,2.1566158885518 +88,2,3,2.37700426614069 +88,3,3,4.67890547088398 +88,4,3,5.26815327424333 +88,5,3,6.29140120414791 +88,6,3,6.89251845648695 +88,7,3,8.29889197576343 +88,8,3,9.02620478742367 +89,1,3,0.522823311622717 +89,2,3,0.720894021704307 +89,3,3,2.0481538775907 +89,4,3,3.53706437353728 +89,5,3,4.08516480881842 +89,6,3,5.19435218107831 +89,7,3,6.35222386583984 +89,8,3,7.58938749114136 +90,1,3,0.110478214499304 +90,2,3,0.361221016034652 +90,3,3,1.69191466379943 +90,4,3,2.93434401191667 +90,5,3,3.88750436417871 +90,6,3,5.01369680784638 +90,7,3,5.70314936370187 +90,8,3,6.77412158237975 +91,1,3,0.158076458872298 +91,2,3,0.748418876833585 +91,3,3,2.33519900265607 +91,4,3,3.47993652356401 +91,5,3,4.22316386563191 +91,6,3,5.59531669381371 +91,7,3,6.29475494423444 +91,8,3,7.31018232042381 +92,1,3,-0.845667721672671 +92,2,3,-0.0952237476902149 +92,3,3,1.09449929433509 +92,4,3,2.32697635305069 +92,5,3,3.53162160382455 +92,6,3,4.18412212993421 +92,7,3,5.43072655030752 +92,8,3,6.2528718321991 +93,1,3,-0.554396371599341 +93,2,3,-0.487548471136003 +93,3,3,0.872855049255904 +93,4,3,2.62512632052788 +93,5,3,3.63237392425111 +93,6,3,3.8332724417017 +93,7,3,5.38698788925701 +93,8,3,6.51892807410421 +94,1,3,1.74353621268684 +94,2,3,2.62240125056578 +94,3,3,3.4658937291633 +94,4,3,4.71292088569413 +94,5,3,5.65289936064052 +94,6,3,6.97774155638962 +94,7,3,7.5748409825114 +94,8,3,8.32376512209955 +95,1,3,1.24501550075218 +95,2,3,1.27936709974271 +95,3,3,2.86647376653648 +95,4,3,3.84154660633742 +95,5,3,4.71191176207907 +95,6,3,5.65740174075242 +95,7,3,6.81485607094101 +95,8,3,7.98414735973719 +96,1,3,0.835625442247262 +96,2,3,1.63524961836139 +96,3,3,2.88242212021634 +96,4,3,3.75660238927897 +96,5,3,4.91209469035123 +96,6,3,6.12390731013328 +96,7,3,7.04894242300733 +96,8,3,7.86466760806126 +97,1,3,-0.0227065616852005 +97,2,3,0.570270933257905 +97,3,3,1.53086916477496 +97,4,3,2.70270441960765 +97,5,3,3.65556887712201 +97,6,3,4.52944799962726 +97,7,3,5.81665674179551 +97,8,3,6.26737176243005 +98,1,3,0.1608080379801 +98,2,3,0.746437314670479 +98,3,3,2.23111744027251 +98,4,3,3.2078673406281 +98,5,3,4.40130516506188 +98,6,3,5.24395988270281 +98,7,3,6.30157721343787 +98,8,3,7.03498514715133 +99,1,3,1.8412747626686 +99,2,3,2.24192297245743 +99,3,3,4.14772668698801 +99,4,3,4.57029118484342 +99,5,3,5.97678772699363 +99,6,3,7.40705263000443 +99,7,3,8.20310972268435 +99,8,3,8.95583458420776 +100,1,3,0.853603661254639 +100,2,3,1.15557667523494 +100,3,3,3.01447867943591 +100,4,3,4.17018419684842 +100,5,3,4.97210964256587 +100,6,3,5.83965101740009 +100,7,3,6.92479587369302 +100,8,3,7.97958531088616 +101,1,3,2.39580725217959 +101,2,3,2.86449727737082 +101,3,3,4.17312547480334 +101,4,3,5.53806752827575 +101,5,3,6.39501064043096 +101,6,3,7.26811215000094 +101,7,3,8.53649514251206 +101,8,3,9.45485039379188 +102,1,3,0.595676431885384 +102,2,3,1.05348373913475 +102,3,3,2.63582181740774 +102,4,3,3.7167673944854 +102,5,3,4.52909719292877 +102,6,3,5.41266353430887 +102,7,3,6.43924226913368 +102,8,3,7.18526772340207 +103,1,3,1.5107003697783 +103,2,3,2.12260957942516 +103,3,3,3.18644365578075 +103,4,3,4.53918387496874 +103,5,3,5.06341455602335 +103,6,3,6.49283917995712 +103,7,3,7.75323395581945 +103,8,3,8.35667316156763 +104,1,3,1.88866758741314 +104,2,3,2.18231899894117 +104,3,3,3.71554884855156 +104,4,3,4.68146750843871 +104,5,3,6.10361571530113 +104,6,3,6.67083431262761 +104,7,3,7.7800345307164 +104,8,3,8.92777893266183 +105,1,3,1.73762508172132 +105,2,3,2.32733246243909 +105,3,3,3.55460759166834 +105,4,3,4.86889777317026 +105,5,3,6.08193842927183 +105,6,3,7.04148805029592 +105,7,3,7.35974106864338 +105,8,3,8.86588126079279 +106,1,3,-0.537630215045871 +106,2,3,0.353196144159918 +106,3,3,1.74238421667739 +106,4,3,2.7909111088959 +106,5,3,3.97419562263608 +106,6,3,4.98873067434378 +106,7,3,5.61659205454897 +106,8,3,6.92400449203583 +107,1,3,-1.51336148966568 +107,2,3,-0.987771702119031 +107,3,3,0.510206440394591 +107,4,3,1.65263967956168 +107,5,3,2.2231983838614 +107,6,3,3.25914843042085 +107,7,3,4.77817247624947 +107,8,3,5.46718795872109 +108,1,3,-0.621733240381946 +108,2,3,0.0353380174099048 +108,3,3,1.476436324534 +108,4,3,2.4821327383924 +108,5,3,3.66505516491496 +108,6,3,4.26493239579394 +108,7,3,5.03385050201781 +108,8,3,6.15505561370686 +109,1,3,0.65698658036601 +109,2,3,0.879559369971633 +109,3,3,2.45640402745839 +109,4,3,3.78941510679771 +109,5,3,4.73263239779589 +109,6,3,5.33156639906988 +109,7,3,6.54950383050795 +109,8,3,7.66046552316943 +110,1,3,-0.278290439134022 +110,2,3,-0.00531568792845563 +110,3,3,1.32161966941723 +110,4,3,2.67616444547022 +110,5,3,3.77072238532519 +110,6,3,4.16247631364279 +110,7,3,5.47110653208128 +110,8,3,6.27650112074593 +111,1,3,1.26035159610515 +111,2,3,1.67081325682447 +111,3,3,3.09018449245164 +111,4,3,4.19629820774435 +111,5,3,5.23300416518958 +111,6,3,6.12336713464138 +111,7,3,7.09543733662832 +111,8,3,7.9910425192222 +112,1,3,0.768782327309874 +112,2,3,1.49434073539369 +112,3,3,3.0220678108492 +112,4,3,3.67076347692299 +112,5,3,4.84071776365439 +112,6,3,5.42711417981796 +112,7,3,6.74530298073304 +112,8,3,8.05337576385035 +113,1,3,-0.512842567222705 +113,2,3,0.333803710315301 +113,3,3,1.77579958118868 +113,4,3,2.70429446040193 +113,5,3,3.44439599189216 +113,6,3,4.74446930915482 +113,7,3,5.78131060905661 +113,8,3,6.75110414102881 +114,1,3,-0.185357563549793 +114,2,3,0.304278260357402 +114,3,3,1.40795238760697 +114,4,3,2.3141226921558 +114,5,3,3.26458761204062 +114,6,3,4.53496954953977 +114,7,3,5.24542510625861 +114,8,3,6.2486231935731 +115,1,3,0.244762750632874 +115,2,3,0.182902600080998 +115,3,3,2.15982418169374 +115,4,3,2.93209076379569 +115,5,3,3.56306414845394 +115,6,3,5.15462368228077 +115,7,3,5.96139690314987 +115,8,3,6.87629331865792 +116,1,3,0.899813559307228 +116,2,3,1.29836655405475 +116,3,3,2.43105086510445 +116,4,3,3.3601921105621 +116,5,3,4.60638982728935 +116,6,3,5.80556147441881 +116,7,3,6.73591475723964 +116,8,3,7.67421963338842 +117,1,3,0.670377457370441 +117,2,3,0.655388579586071 +117,3,3,2.32780928316695 +117,4,3,3.08664711993556 +117,5,3,4.36954660417253 +117,6,3,5.42122299400728 +117,7,3,6.26240238609592 +117,8,3,7.20523579801548 +118,1,3,-0.821146989568458 +118,2,3,-0.396815103673733 +118,3,3,1.65289439528466 +118,4,3,2.4546710056127 +118,5,3,3.56108791933014 +118,6,3,4.48732065309802 +118,7,3,5.81742309222639 +118,8,3,6.16741022457937 +119,1,3,0.2512503215211 +119,2,3,0.609118567341081 +119,3,3,2.2904153273313 +119,4,3,3.16269752131974 +119,5,3,3.98072316436875 +119,6,3,5.02143044194653 +119,7,3,6.16454816370158 +119,8,3,6.85345983286224 +120,1,5,2.07355769991514 +120,2,5,2.70176141101685 +120,3,5,3.07532887658374 +120,4,5,3.54183584355425 +120,5,5,5.35574267530824 +120,6,5,5.94031247762939 +120,7,5,6.998965831818 +120,8,5,8.25391424022031 +121,1,5,0.545705645898663 +121,2,5,1.59458162025258 +121,3,5,1.77632380720174 +121,4,5,2.39303998153606 +121,5,5,3.69611036335378 +121,6,5,4.78728498785316 +121,7,5,5.77511141255148 +121,8,5,6.62925759767466 +122,1,5,0.348648056262885 +122,2,5,1.32497888174757 +122,3,5,1.7792456248239 +122,4,5,2.06883846870991 +122,5,5,3.55762813412755 +122,6,5,4.61186449320064 +122,7,5,5.94469339723567 +122,8,5,7.22402603967549 +123,1,5,0.11473014330258 +123,2,5,0.511124556261185 +123,3,5,1.14695019214237 +123,4,5,1.619003464941 +123,5,5,2.98883880634006 +123,6,5,4.318828002304 +123,7,5,5.04118314730071 +123,8,5,5.89015258629618 +124,1,5,0.353026663542674 +124,2,5,0.463146009094651 +124,3,5,1.03337658770784 +124,4,5,1.59307067258282 +124,5,5,3.02330619421151 +124,6,5,4.03816633979783 +124,7,5,5.0128772757355 +124,8,5,6.354336326349 +125,1,5,1.30759376251991 +125,2,5,1.43063736991179 +125,3,5,2.40942319627773 +125,4,5,2.52882564103711 +125,5,5,4.17879800486291 +125,6,5,5.07127883112598 +125,7,5,6.16587286371303 +125,8,5,7.33375556255224 +126,1,5,-0.0531641770028999 +126,2,5,0.164519460371381 +126,3,5,0.55992507886714 +126,4,5,1.30926892866658 +126,5,5,2.66056927655847 +126,6,5,3.59413029414135 +126,7,5,4.45258367453011 +126,8,5,5.75588726052884 +127,1,5,2.14707661024169 +127,2,5,2.7033355439571 +127,3,5,3.7305747086312 +127,4,5,4.03212587485596 +127,5,5,5.17105936912862 +127,6,5,6.50471451427771 +127,7,5,7.38290008652188 +127,8,5,8.61332661567901 +128,1,5,0.347949394487828 +128,2,5,0.515133037384221 +128,3,5,0.946661617926291 +128,4,5,1.5863912063442 +128,5,5,3.07486872976229 +128,6,5,4.00854958247365 +128,7,5,4.95533265684274 +128,8,5,6.13097781768475 +129,1,5,1.15350266866743 +129,2,5,1.67385258352653 +129,3,5,1.90531572360025 +129,4,5,2.64547351304772 +129,5,5,3.65116556797884 +129,6,5,5.12513430486377 +129,7,5,6.24326209088698 +129,8,5,7.39585785944324 +130,1,5,2.00368331013529 +130,2,5,2.31096321570056 +130,3,5,3.14897486683582 +130,4,5,3.60000356149414 +130,5,5,5.02521231648775 +130,6,5,5.89867609327909 +130,7,5,7.12214412591731 +130,8,5,8.08117262469428 +131,1,5,-0.160822734391134 +131,2,5,0.722303945240265 +131,3,5,0.697364647519555 +131,4,5,1.1385294912458 +131,5,5,2.83950274529066 +131,6,5,3.97833391764805 +131,7,5,4.5028962883293 +131,8,5,6.02560936653114 +132,1,5,0.579307594077414 +132,2,5,1.28304853736308 +132,3,5,1.5125162008707 +132,4,5,2.31460690732269 +132,5,5,3.7141364128688 +132,6,5,4.8465635471783 +132,7,5,5.57115008644628 +132,8,5,6.55282272317461 +133,1,5,-0.852547031973224 +133,2,5,-0.162648504898197 +133,3,5,0.125175541718792 +133,4,5,1.00399240977452 +133,5,5,2.02288849499148 +133,6,5,3.55731287964353 +133,7,5,4.15789446972334 +133,8,5,5.26073588877272 +134,1,5,-0.490900026742535 +134,2,5,0.118353023295689 +134,3,5,0.590971956082911 +134,4,5,1.08995593062194 +134,5,5,2.62672336169321 +134,6,5,3.38336882640513 +134,7,5,4.47244614261959 +134,8,5,5.99903342583879 +135,1,5,1.12327735834762 +135,2,5,1.37526813862002 +135,3,5,1.67058112798098 +135,4,5,2.23269845844678 +135,5,5,4.38053720522963 +135,6,5,5.0119754398454 +135,7,5,5.75715487827219 +135,8,5,6.87080838229093 +136,1,5,0.641149283406735 +136,2,5,0.833649121738602 +136,3,5,1.34980559006897 +136,4,5,1.76155690834355 +136,5,5,3.35038151623076 +136,6,5,4.41802895727517 +136,7,5,5.31236990270441 +136,8,5,6.26817268927078 +137,1,5,0.696355760662733 +137,2,5,1.00600830150395 +137,3,5,1.60557592949137 +137,4,5,2.03956174255656 +137,5,5,3.52003551528159 +137,6,5,4.37233094089374 +137,7,5,6.02506245389463 +137,8,5,6.76338217231818 +138,1,5,1.48044969692376 +138,2,5,1.69156360234213 +138,3,5,2.3230020268256 +138,4,5,2.60046145396267 +138,5,5,3.99689834307206 +138,6,5,5.34886941673575 +138,7,5,6.40194897896813 +138,8,5,7.43356794691465 +139,1,5,1.12132230347744 +139,2,5,2.04919566667762 +139,3,5,2.37066285188843 +139,4,5,3.04517876839378 +139,5,5,4.80960943033978 +139,6,5,5.52162367612219 +139,7,5,6.60265929863807 +139,8,5,7.50480641176629 +140,1,5,-0.796095474230791 +140,2,5,-0.452579279400348 +140,3,5,0.00947453753975691 +140,4,5,0.833452724844031 +140,5,5,2.57661594087727 +140,6,5,3.25171966543205 +140,7,5,4.21904639603497 +140,8,5,5.27228467034679 +141,1,5,1.44585179068538 +141,2,5,2.08873295069904 +141,3,5,2.9477320311015 +141,4,5,3.16217368658878 +141,5,5,4.47270459846095 +141,6,5,5.73960305079448 +141,7,5,6.40373672551532 +141,8,5,7.55728689115746 +142,1,5,-0.419083902232799 +142,2,5,0.234429057049568 +142,3,5,0.545730486376036 +142,4,5,1.5344252444695 +142,5,5,2.65952327092003 +142,6,5,4.24797783021148 +142,7,5,5.06698135435331 +142,8,5,6.20296172042246 +143,1,5,-1.5759748924097 +143,2,5,-0.86876312251632 +143,3,5,-0.490970695678289 +143,4,5,-0.490913878274664 +143,5,5,1.63376523536751 +143,6,5,2.29734560926664 +143,7,5,3.58238286054971 +143,8,5,4.91327085824494 +144,1,5,0.811978814671721 +144,2,5,1.65001069738936 +144,3,5,1.92530050386947 +144,4,5,2.73902511393377 +144,5,5,4.25416472998866 +144,6,5,4.98233651146248 +144,7,5,6.50313076575639 +144,8,5,7.13923362603824 +145,1,5,1.42018832659402 +145,2,5,2.12203019512399 +145,3,5,2.47549419271258 +145,4,5,2.79695982344628 +145,5,5,4.31985710330443 +145,6,5,5.20566778145388 +145,7,5,6.50710807808323 +145,8,5,7.25253764628188 +146,1,5,0.467187816876626 +146,2,5,0.776203327379875 +146,3,5,1.58690282822873 +146,4,5,1.9252279760618 +146,5,5,3.17601713465207 +146,6,5,4.50551046887625 +146,7,5,5.58620593388508 +146,8,5,6.26271631236738 +147,1,5,0.179163589204672 +147,2,5,0.793213070348187 +147,3,5,1.62995570591685 +147,4,5,1.70516418639264 +147,5,5,3.48446303566426 +147,6,5,4.66418521855636 +147,7,5,5.41086269876853 +147,8,5,6.39593748221464 +148,1,5,-0.329721641230399 +148,2,5,-0.104604986399989 +148,3,5,0.557117626678429 +148,4,5,1.1283800254646 +148,5,5,2.49850643335396 +148,6,5,3.49652631627393 +148,7,5,4.83117268302154 +148,8,5,5.8778316560307 +149,1,5,2.83546914301537 +149,2,5,3.55942504509601 +149,3,5,4.0343478480526 +149,4,5,4.29036331941152 +149,5,5,5.84267818942925 +149,6,5,6.95988152118232 +149,7,5,7.84371289141104 +149,8,5,8.99504393820711 +150,1,5,0.353558327909409 +150,2,5,0.0971312003621931 +150,3,5,0.999285213587224 +150,4,5,1.32465699506173 +150,5,5,3.04946029363458 +150,6,5,4.2541166378007 +150,7,5,5.09876372344629 +150,8,5,5.71906199696452 +151,1,5,0.525750886451906 +151,2,5,0.893396180499992 +151,3,5,2.07896509350654 +151,4,5,2.25867109481025 +151,5,5,3.72877107308203 +151,6,5,4.56557003998835 +151,7,5,5.74034210162108 +151,8,5,6.66484800119341 +152,1,5,-0.959359779342574 +152,2,5,-0.315249756011789 +152,3,5,0.0501839424136046 +152,4,5,1.04222089222118 +152,5,5,1.99002316333137 +152,6,5,3.05414518371304 +152,7,5,4.13750681004899 +152,8,5,5.17931328304503 +153,1,5,-0.475784459340995 +153,2,5,0.038042614793213 +153,3,5,0.384272767841674 +153,4,5,1.22364423384171 +153,5,5,2.44543176058119 +153,6,5,3.41636201138061 +153,7,5,4.35074939454188 +153,8,5,5.49402389634794 +154,1,5,-0.479392940089231 +154,2,5,0.105242923746075 +154,3,5,0.665810253622011 +154,4,5,1.19760855819624 +154,5,5,2.83587415784842 +154,6,5,3.90803611453573 +154,7,5,4.64589043122806 +154,8,5,5.73795478250453 +155,1,5,0.608317423060096 +155,2,5,1.40380927513764 +155,3,5,1.75866339340061 +155,4,5,2.62202958412669 +155,5,5,3.77419704141051 +155,6,5,5.03265720708436 +155,7,5,5.7044836778403 +155,8,5,6.33816195552479 +156,1,5,1.86431961335335 +156,2,5,1.82330603084261 +156,3,5,2.2285891986006 +156,4,5,2.95204440516718 +156,5,5,4.38377615569592 +156,6,5,6.03714325749645 +156,7,5,6.57073648844073 +156,8,5,7.47937160050042 +157,1,5,1.99840370192053 +157,2,5,2.59121177630995 +157,3,5,3.15671154350392 +157,4,5,3.42920370304041 +157,5,5,5.0599439931328 +157,6,5,6.143723692161 +157,7,5,7.1165003634696 +157,8,5,8.2525571950614 +158,1,5,1.03319511937132 +158,2,5,1.32039470544842 +158,3,5,2.11133018212242 +158,4,5,2.66975571142703 +158,5,5,3.96141968710621 +158,6,5,4.75942104932975 +158,7,5,5.95429631606491 +158,8,5,6.92766633257145 +159,1,5,1.03890300454548 +159,2,5,1.222165245985 +159,3,5,2.41987659803987 +159,4,5,2.23505378240698 +159,5,5,4.01387523949909 +159,6,5,5.27211433799155 +159,7,5,5.69558088483459 +159,8,5,7.06135032337447 +160,1,5,1.26288899611135 +160,2,5,1.54092617683566 +160,3,5,2.17184761709794 +160,4,5,3.01475506628533 +160,5,5,4.2263374271918 +160,6,5,5.25666266948215 +160,7,5,6.3595385670447 +160,8,5,7.50241057062099 +161,1,5,0.505126103765275 +161,2,5,0.562896294462252 +161,3,5,1.31207599745235 +161,4,5,2.20151827539788 +161,5,5,3.29283903880832 +161,6,5,4.34411537759276 +161,7,5,5.28441267859029 +161,8,5,6.70739180559209 +162,1,5,-0.0758061262284153 +162,2,5,0.407051283873834 +162,3,5,0.893792739036451 +162,4,5,1.56889992048944 +162,5,5,3.46568416712827 +162,6,5,4.17380092831167 +162,7,5,4.92813412215899 +162,8,5,6.02311921589609 +163,1,5,-0.104307312353201 +163,2,5,0.568610824001807 +163,3,5,0.725597092563344 +163,4,5,0.99041532712622 +163,5,5,2.82907626157414 +163,6,5,3.81925025136067 +163,7,5,4.83817321685703 +163,8,5,6.0862540373602 +164,1,5,1.25321535697964 +164,2,5,2.03833270786335 +164,3,5,2.44594553570962 +164,4,5,2.70600596482496 +164,5,5,4.6855826520896 +164,6,5,5.37092292764573 +164,7,5,6.19475318486653 +164,8,5,7.22234055091044 +165,1,5,1.00664552922359 +165,2,5,1.7171680741074 +165,3,5,2.06937727670354 +165,4,5,2.97730237529092 +165,5,5,4.32308381364884 +165,6,5,5.71856864449326 +165,7,5,6.24525117654903 +165,8,5,7.2052447929068 +166,1,5,-1.4718533836255 +166,2,5,-0.588772870941195 +166,3,5,-0.0624960662781316 +166,4,5,0.491697585570406 +166,5,5,1.69920267162904 +166,6,5,3.37409071109295 +166,7,5,3.83024230236719 +166,8,5,5.00951530150485 +167,1,5,-0.607864283484989 +167,2,5,-0.0418177192546643 +167,3,5,0.698535572057312 +167,4,5,0.901324379263428 +167,5,5,2.79178684877704 +167,6,5,3.72835771969893 +167,7,5,4.54282199057793 +167,8,5,5.72948476464321 +168,1,5,0.200958886180729 +168,2,5,0.802340917781474 +168,3,5,1.27035103196685 +168,4,5,2.19711914880414 +168,5,5,3.23068618552698 +168,6,5,4.46970917688263 +168,7,5,5.42317054951253 +168,8,5,6.50804555661555 +169,1,5,0.0674731172700116 +169,2,5,0.507237516662246 +169,3,5,0.108760678450507 +169,4,5,1.2162416343858 +169,5,5,2.9517080409866 +169,6,5,3.94070623679996 +169,7,5,4.73529607890144 +169,8,5,5.64402582861047 +170,1,5,0.323097510356277 +170,2,5,0.899917182906831 +170,3,5,1.41380536838571 +170,4,5,2.36254857153997 +170,5,5,3.66957487642263 +170,6,5,4.64478093842395 +170,7,5,5.49150398374669 +170,8,5,6.30597612053104 +171,1,5,0.257146298326122 +171,2,5,0.807139154449787 +171,3,5,1.51490378109914 +171,4,5,2.44162215343575 +171,5,5,3.34879387663644 +171,6,5,4.56535689058725 +171,7,5,5.08392715840072 +171,8,5,6.64577434826485 +172,1,5,1.07442579743174 +172,2,5,1.11713008367254 +172,3,5,1.84807148756904 +172,4,5,2.22664278974788 +172,5,5,3.5125714070689 +172,6,5,4.46591009194485 +172,7,5,5.67428613083647 +172,8,5,6.5901028045963 +173,1,5,1.11353773046385 +173,2,5,1.39859094841482 +173,3,5,1.83383044657518 +173,4,5,2.50954782806781 +173,5,5,4.24807100347722 +173,6,5,4.93629127979168 +173,7,5,6.0126683755917 +173,8,5,7.12687807353198 +174,1,5,0.882227861333195 +174,2,5,1.50660300113403 +174,3,5,1.51249412161028 +174,4,5,2.53017611493252 +174,5,5,3.88136850994042 +174,6,5,5.02833646153472 +174,7,5,5.92625459663981 +174,8,5,6.95399935432011 +175,1,5,-0.0970396494888282 +175,2,5,0.226704462041368 +175,3,5,0.33832107775446 +175,4,5,1.14377884803933 +175,5,5,3.00187534622307 +175,6,5,3.51757115304584 +175,7,5,4.69335499585095 +175,8,5,6.01843425497474 +176,1,5,-0.658535943004998 +176,2,5,-0.230320331136193 +176,3,5,0.776663101309249 +176,4,5,1.06224281867045 +176,5,5,2.48730802469369 +176,6,5,3.45136727914191 +176,7,5,4.82824700388281 +176,8,5,5.29942033580146 +177,1,5,1.4255783453699 +177,2,5,2.14014946272135 +177,3,5,2.44472412661084 +177,4,5,2.83742801918473 +177,5,5,4.62576722313449 +177,6,5,5.328331234428 +177,7,5,6.56915410754912 +177,8,5,7.83258416591696 +178,1,5,-0.650135897405137 +178,2,5,-0.205276274709679 +178,3,5,0.317605110928796 +178,4,5,1.15255629084519 +178,5,5,2.24872774444275 +178,6,5,3.44192562386023 +178,7,5,4.1418005071401 +178,8,5,5.28892135599992 +179,1,5,1.05604366436219 +179,2,5,1.42116900211466 +179,3,5,1.90217779285062 +179,4,5,2.32016885606758 +179,5,5,4.02830768350421 +179,6,5,4.81009267727798 +179,7,5,5.87489091008465 +179,8,5,6.91120869560989 diff --git a/diff_diff/two_stage.py b/diff_diff/two_stage.py index 96eb398f..5255dc9c 100644 --- a/diff_diff/two_stage.py +++ b/diff_diff/two_stage.py @@ -2430,8 +2430,6 @@ def _stage2_static( att = float(coef[0]) # GMM sandwich variance - eps_2 = y_tilde - np.dot(X_2, coef) # Stage 2 residuals - V = self._compute_gmm_variance( df=df, unit=unit, @@ -2443,7 +2441,6 @@ def _stage2_static( delta_hat=delta_hat, kept_cov_mask=kept_cov_mask, X_2=X_2, - eps_2=eps_2, cluster_ids=df[cluster_var].values, survey_weights=survey_weights, resolved_survey=resolved_survey, @@ -2601,7 +2598,6 @@ def _stage2_event_study( weights=survey_weights, weight_type=survey_weight_type, ) - eps_2 = y_tilde - np.dot(X_2, coef) # GMM variance for full coefficient vector V = self._compute_gmm_variance( @@ -2615,7 +2611,6 @@ def _stage2_event_study( delta_hat=delta_hat, kept_cov_mask=kept_cov_mask, X_2=X_2, - eps_2=eps_2, cluster_ids=df[cluster_var].values, survey_weights=survey_weights, resolved_survey=resolved_survey, @@ -2727,7 +2722,6 @@ def _stage2_group( weights=survey_weights, weight_type=survey_weight_type, ) - eps_2 = y_tilde - np.dot(X_2, coef) # GMM variance V = self._compute_gmm_variance( @@ -2741,7 +2735,6 @@ def _stage2_group( delta_hat=delta_hat, kept_cov_mask=kept_cov_mask, X_2=X_2, - eps_2=eps_2, cluster_ids=df[cluster_var].values, survey_weights=survey_weights, resolved_survey=resolved_survey, @@ -2832,7 +2825,6 @@ def _compute_gmm_variance( delta_hat: Optional[np.ndarray], kept_cov_mask: Optional[np.ndarray], X_2: np.ndarray, - eps_2: np.ndarray, cluster_ids: np.ndarray, survey_weights: Optional[np.ndarray] = None, resolved_survey=None, @@ -2865,9 +2857,9 @@ def _compute_gmm_variance( Parameters ---------- X_2 : np.ndarray, shape (n, k) - Stage 2 design matrix (treatment indicators). - eps_2 : np.ndarray, shape (n,) - Stage 2 residuals. + Stage 2 design matrix (treatment indicators). The Stage-2 residual + ``eps_2`` is re-solved internally from the *exact* Stage-1 residuals + (see the exact-residual note below), so it is not a parameter. cluster_ids : np.ndarray, shape (n,) Cluster identifiers, fit-sample length. Used for the per-cluster stage-1 / stage-2 score aggregation (OLS path). @@ -2923,9 +2915,17 @@ def _compute_gmm_variance( # eps_10 = Y - X_10 @ gamma_hat # Untreated: stage 1 residual (Y - fitted). Treated: Y (X_10 rows = 0). - # Reconstruct Y from y_tilde: Y = y_tilde + fitted_stage1 + # Reconstruct Y from y_tilde: Y = y_tilde + fitted_stage1. Because + # y_tilde = Y - fitted_1, the iterative FE in fitted_1 cancel exactly, so + # y_vals == Y (independent of the iterative solver's tolerance). alpha_i = df[unit].map(unit_fe).values beta_t = df[time].map(time_fe).values + # Identification mask: obs whose unit AND time FE are both identified by the + # untreated Stage-1 fit. Rank-deficient / Proposition-5 obs (NaN FE) keep the + # iterative-residual behavior; only identified obs get the exact residuals. + identified = np.isfinite(np.asarray(alpha_i, dtype=float)) & np.isfinite( + np.asarray(beta_t, dtype=float) + ) alpha_i = np.where(pd.isna(alpha_i), 0.0, alpha_i).astype(float) beta_t = np.where(pd.isna(beta_t), 0.0, beta_t).astype(float) fitted_1 = alpha_i + beta_t @@ -2937,21 +2937,22 @@ def _compute_gmm_variance( y_tilde = df["_y_tilde"].values y_vals = y_tilde + fitted_1 # reconstruct Y + y_vals_clean = np.nan_to_num(y_vals, nan=0.0) - # eps_10: for untreated, stage 1 residual; for treated, Y_i (since X_10 rows = 0) - eps_10 = np.empty(n) omega_0 = omega_0_mask.values - eps_10[omega_0] = y_vals[omega_0] - fitted_1[omega_0] # Stage 1 residual - eps_10[~omega_0] = y_vals[~omega_0] # x_{10i} = 0, so eps_10 = Y # 1. gamma_hat = (X'_{10} W X_{10})^{-1} (X'_1 W X_2) [p x k] - # With survey weights, both cross-products need W + # With survey weights, both cross-products need W. We reuse the SAME + # factorization of (X'_{10} W X_{10}) to also solve the exact Stage-1 FE + # coefficients theta_exact (see exact-residual note below). if survey_weights is not None: XtWX_10 = X_10_sparse.T @ X_10_sparse.multiply(survey_weights[:, None]) Xt1_WX2 = X_1_sparse.T @ (X_2 * survey_weights[:, None]) + rhs_fe = X_10_sparse.T @ (survey_weights * y_vals_clean) else: XtWX_10 = X_10_sparse.T @ X_10_sparse # (p x p) sparse Xt1_WX2 = X_1_sparse.T @ X_2 # (p x k) dense + rhs_fe = X_10_sparse.T @ y_vals_clean # (p,) X'_{10} W Y try: solve_XtX = sparse_factorized(XtWX_10.tocsc()) @@ -2961,6 +2962,7 @@ def _compute_gmm_variance( gamma_hat = np.column_stack( [solve_XtX(Xt1_WX2[:, j]) for j in range(Xt1_WX2.shape[1])] ) + theta_exact = np.asarray(solve_XtX(np.asarray(rhs_fe).ravel())).ravel() except RuntimeError as exc: # Singular matrix — fall back to dense least-squares. Silent-failure # audit axis C: emit a UserWarning on fallback instead of swallowing. @@ -2973,9 +2975,32 @@ def _compute_gmm_variance( UserWarning, stacklevel=2, ) - gamma_hat = np.linalg.lstsq(XtWX_10.toarray(), Xt1_WX2, rcond=None)[0] + XtWX_10_dense = XtWX_10.toarray() + gamma_hat = np.linalg.lstsq(XtWX_10_dense, Xt1_WX2, rcond=None)[0] if gamma_hat.ndim == 1: gamma_hat = gamma_hat.reshape(-1, 1) + theta_exact = np.linalg.lstsq(XtWX_10_dense, np.asarray(rhs_fe).ravel(), rcond=None)[0] + + # Exact Stage-1 / Stage-2 residuals. The point-estimate path uses the + # iterative alternating-projection FE solver (`_iterative_fe`), which + # converges only to ~1e-7 on unbalanced untreated panels; that error is + # negligible for the ATT but perturbs the variance by ~1% relative to the + # analytical GMM sandwich. The variance therefore re-solves the Stage-1 FE + # EXACTLY using the sparse normal equations already factorized for gamma_hat + # (theta_exact), matching R `did2s` to ~1e-7 and mirroring ImputationDiD's + # exact-sparse variance path. The shared `_exact_gmm_residuals` helper is + # used by BOTH this analytical path and the multiplier bootstrap + # (`_compute_cluster_S_scores`) so the influence function is single-sourced. + eps_10, eps_2 = self._exact_gmm_residuals( + X_1_sparse, + theta_exact, + y_vals_clean, + identified, + omega_0, + y_tilde, + X_2, + survey_weights, + ) # 2. Per-cluster Stage 1 scores: c_g = sum_{i in g} w_i * x_{10i} * eps_{10i} # Only untreated obs have non-zero X_10 rows @@ -3186,8 +3211,13 @@ def _build_fe_design( """ Build sparse FE design matrices X_1 (all obs) and X_10 (untreated rows only). - Column layout: [unit_0, ..., unit_{U-2}, time_0, ..., time_{T-2}, cov_1, ..., cov_C] - (Drop first unit and first time for identification.) + Column layout: [intercept, unit_1, ..., unit_{U-1}, time_1, ..., time_{T-1}, + cov_1, ..., cov_C] (drop first unit and first time for identification, with an + intercept). The intercept makes the column space span the constant (the grand + mean); the prior intercept-free layout silently omitted the grand mean from the + FE span, which biased the GMM-sandwich residuals when re-solved exactly. With + the intercept this is the standard full-rank two-way FE (matches fixest / R + ``did2s``). X_10 is identical to X_1 except that rows for treated observations are zeroed out. @@ -3210,30 +3240,38 @@ def _build_fe_design( n_units = len(all_units) n_times = len(all_times) n_cov = len(covariates) if covariates else 0 - n_fe_cols = (n_units - 1) + (n_times - 1) + # [intercept, unit_1..unit_{U-1}, time_1..time_{T-1}] — the intercept (col 0) + # makes the column space span the constant / grand mean (see docstring). + n_fe_cols = 1 + (n_units - 1) + (n_times - 1) def _build_rows(mask=None): """Build sparse matrix for given observation mask.""" - # Unit dummies (drop first) + all_rows = np.arange(n) + + # Intercept (col 0): 1 for every (masked) row. + i_rows = all_rows if mask is None else all_rows[mask] + i_cols = np.zeros(len(i_rows), dtype=int) + + # Unit dummies (drop first) at cols 1..n_units-1 u_indices = np.array([unit_to_idx[u] for u in unit_vals]) u_mask = u_indices > 0 if mask is not None: u_mask = u_mask & mask - u_rows = np.arange(n)[u_mask] - u_cols = u_indices[u_mask] - 1 + u_rows = all_rows[u_mask] + u_cols = u_indices[u_mask] # 1..n_units-1 (intercept occupies col 0) - # Time dummies (drop first) + # Time dummies (drop first) at cols n_units..n_units+n_times-2 t_indices = np.array([time_to_idx[t] for t in time_vals]) t_mask = t_indices > 0 if mask is not None: t_mask = t_mask & mask - t_rows = np.arange(n)[t_mask] - t_cols = (n_units - 1) + t_indices[t_mask] - 1 + t_rows = all_rows[t_mask] + t_cols = n_units + t_indices[t_mask] - 1 - rows = np.concatenate([u_rows, t_rows]) - cols = np.concatenate([u_cols, t_cols]) + rows = np.concatenate([i_rows, u_rows, t_rows]) + cols = np.concatenate([i_cols, u_cols, t_cols]) data = np.ones(len(rows)) A_fe = sparse.csr_matrix((data, (rows, cols)), shape=(n, n_fe_cols)) diff --git a/diff_diff/two_stage_bootstrap.py b/diff_diff/two_stage_bootstrap.py index 2a2a693d..e21aae42 100644 --- a/diff_diff/two_stage_bootstrap.py +++ b/diff_diff/two_stage_bootstrap.py @@ -22,7 +22,6 @@ from diff_diff.bootstrap_utils import ( generate_survey_multiplier_weights_batch as _generate_survey_multiplier_weights_batch, ) -from diff_diff.linalg import solve_ols from diff_diff.two_stage_results import TwoStageBootstrapResults # Maximum number of elements before falling back to per-column sparse aggregation. @@ -63,6 +62,66 @@ def _compute_gmm_scores( s2_by_cluster: np.ndarray, ) -> np.ndarray: ... + @staticmethod + def _exact_gmm_residuals( + X_1_sparse, + theta_exact: np.ndarray, + y_vals_clean: np.ndarray, + identified: np.ndarray, + omega_0: np.ndarray, + y_tilde: np.ndarray, + X_2: np.ndarray, + survey_weights: Optional[np.ndarray], + ) -> Tuple[np.ndarray, np.ndarray]: + """Exact Stage-1 / Stage-2 residuals for the GMM influence function. + + Given the EXACT Stage-1 FE coefficients ``theta_exact`` (solved from the + same ``(X'_{10} W X_{10})`` factorization used for ``gamma_hat``), return the + exact Stage-1 residual ``eps_10`` (untreated rows) and the exact Stage-2 + residual ``eps_2``. **Shared** by the analytical GMM variance + (``TwoStageDiD._compute_gmm_variance``) and the multiplier bootstrap + (``_compute_cluster_S_scores``) so both build the per-cluster influence + score ``S_g = gamma_hat' c_g - X'_{2g} eps_{2g}`` from the same exact + residuals. The iterative alternating-projection FE used for the point + estimate is only ~1e-7-accurate on unbalanced untreated panels, which + perturbs the variance ~1% relative to the analytical sandwich; obs whose FE + are unidentified (rank-deficient / Proposition-5) fall back to the iterative + residual ``y_tilde`` so those edge cases are unchanged. + """ + n = X_1_sparse.shape[0] + fitted_exact = np.asarray(X_1_sparse @ theta_exact).ravel() + y_tilde_exact = y_vals_clean - fitted_exact + use_exact = identified & np.isfinite(y_tilde_exact) + y_tilde_use = np.where(use_exact, y_tilde_exact, y_tilde) + eps_10 = np.empty(n) + eps_10[omega_0] = y_tilde_use[omega_0] # exact Stage-1 residual (untreated) + eps_10[~omega_0] = y_vals_clean[~omega_0] # x_{10i} = 0, so value is inert + # Exact Stage-2 residual: re-solve delta on the exact residualized outcome + # (X_2 already has NaN-y_tilde rows zeroed by the caller, so masked obs + # contribute nothing to the normal equations). + y_tilde_s2 = np.where(np.isfinite(y_tilde_use), y_tilde_use, 0.0) + if survey_weights is not None: + XtWX2 = X_2.T @ (X_2 * survey_weights[:, None]) + XtWy2 = X_2.T @ (survey_weights * y_tilde_s2) + else: + XtWX2 = X_2.T @ X_2 + XtWy2 = X_2.T @ y_tilde_s2 + try: + delta_2 = np.linalg.solve(XtWX2, XtWy2) + except np.linalg.LinAlgError: + # Silent-failure audit convention: warn before the dense fallback. + warnings.warn( + "TwoStageDiD GMM sandwich: Stage-2 design (X'_2 W X_2) is " + "singular; falling back to dense lstsq for the exact-residual " + "re-solve. This may indicate collinear treatment/horizon " + "indicators and SE estimates may be less reliable.", + UserWarning, + stacklevel=2, + ) + delta_2 = np.linalg.lstsq(XtWX2, XtWy2, rcond=None)[0] + eps_2 = y_tilde_s2 - X_2 @ delta_2 + return eps_10, eps_2 + def _compute_cluster_S_scores( self, df: pd.DataFrame, @@ -75,7 +134,6 @@ def _compute_cluster_S_scores( delta_hat: Optional[np.ndarray], kept_cov_mask: Optional[np.ndarray], X_2: np.ndarray, - eps_2: np.ndarray, cluster_ids: np.ndarray, survey_weights: Optional[np.ndarray] = None, ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: @@ -91,7 +149,6 @@ def _compute_cluster_S_scores( unique_clusters : np.ndarray Unique cluster identifiers. """ - n = len(df) k = X_2.shape[1] cov_list = covariates @@ -103,9 +160,14 @@ def _compute_cluster_S_scores( ) p = X_1_sparse.shape[1] - # Reconstruct Y and compute eps_10 + # Reconstruct Y = y_tilde + fitted_1 (the iterative FE cancel, so y_vals == Y). alpha_i = df[unit].map(unit_fe).values beta_t = df[time].map(time_fe).values + # Obs whose unit AND time FE are both identified by the untreated Stage-1 + # fit; unidentified obs fall back to the iterative residual in the helper. + identified = np.isfinite(np.asarray(alpha_i, dtype=float)) & np.isfinite( + np.asarray(beta_t, dtype=float) + ) alpha_i = np.where(pd.isna(alpha_i), 0.0, alpha_i).astype(float) beta_t = np.where(pd.isna(beta_t), 0.0, beta_t).astype(float) fitted_1 = alpha_i + beta_t @@ -116,20 +178,20 @@ def _compute_cluster_S_scores( fitted_1 = fitted_1 + np.dot(df[cov_list].values, delta_hat) y_tilde = df["_y_tilde"].values - y_vals = y_tilde + fitted_1 - - eps_10 = np.empty(n) + y_vals_clean = np.nan_to_num(y_tilde + fitted_1, nan=0.0) omega_0 = omega_0_mask.values - eps_10[omega_0] = y_vals[omega_0] - fitted_1[omega_0] - eps_10[~omega_0] = y_vals[~omega_0] - # gamma_hat — with survey weights, both cross-products need W + # gamma_hat — with survey weights, both cross-products need W. The same + # (X'_{10} W X_{10}) factorization also yields the EXACT Stage-1 FE + # coefficients theta_exact for the exact-residual helper below. if survey_weights is not None: XtX_10 = X_10_sparse.T @ X_10_sparse.multiply(survey_weights[:, None]) Xt1_X2 = X_1_sparse.T @ (X_2 * survey_weights[:, None]) + rhs_fe = X_10_sparse.T @ (survey_weights * y_vals_clean) else: XtX_10 = X_10_sparse.T @ X_10_sparse Xt1_X2 = X_1_sparse.T @ X_2 + rhs_fe = X_10_sparse.T @ y_vals_clean try: solve_XtX = sparse_factorized(XtX_10.tocsc()) @@ -139,6 +201,7 @@ def _compute_cluster_S_scores( gamma_hat = np.column_stack( [solve_XtX(Xt1_X2[:, j]) for j in range(Xt1_X2.shape[1])] ) + theta_exact = np.asarray(solve_XtX(np.asarray(rhs_fe).ravel())).ravel() except RuntimeError as exc: # Silent-failure audit axis C: emit a UserWarning on fallback instead # of swallowing the error. @@ -150,9 +213,25 @@ def _compute_cluster_S_scores( UserWarning, stacklevel=2, ) - gamma_hat = np.linalg.lstsq(XtX_10.toarray(), Xt1_X2, rcond=None)[0] + XtX_10_dense = XtX_10.toarray() + gamma_hat = np.linalg.lstsq(XtX_10_dense, Xt1_X2, rcond=None)[0] if gamma_hat.ndim == 1: gamma_hat = gamma_hat.reshape(-1, 1) + theta_exact = np.linalg.lstsq(XtX_10_dense, np.asarray(rhs_fe).ravel(), rcond=None)[0] + + # Exact Stage-1 / Stage-2 residuals (shared with the analytical variance) so + # the bootstrap influence function uses the same exact residuals as + # _compute_gmm_variance (not the ~1e-7 iterative residualized outcome). + eps_10, eps_2 = self._exact_gmm_residuals( + X_1_sparse, + theta_exact, + y_vals_clean, + identified, + omega_0, + y_tilde, + X_2, + survey_weights, + ) # Per-cluster aggregation — survey weights multiply eps_10 before sparse multiply if survey_weights is not None: @@ -307,10 +386,8 @@ def _run_bootstrap( # Extract survey weights for S-score computation and Stage-2 WLS survey_weights: Optional[np.ndarray] = None - survey_weight_type: str = "pweight" if resolved_survey is not None: survey_weights = resolved_survey.weights - survey_weight_type = resolved_survey.weight_type # Handle NaN y_tilde (from unidentified FEs) — matches _stage2_static logic nan_mask = ~np.isfinite(y_tilde) @@ -326,14 +403,6 @@ def _run_bootstrap( return None X_2_static = D.reshape(-1, 1) - coef_static = solve_ols( - X_2_static, - y_tilde, - return_vcov=False, - weights=survey_weights, - weight_type=survey_weight_type, - )[0] - eps_2_static = y_tilde - np.dot(X_2_static, coef_static) S_static, bread_static, unique_clusters = self._compute_cluster_S_scores( df=df, @@ -346,7 +415,6 @@ def _run_bootstrap( delta_hat=delta_hat, kept_cov_mask=kept_cov_mask, X_2=X_2_static, - eps_2=eps_2_static, cluster_ids=cluster_ids, survey_weights=survey_weights, ) @@ -485,15 +553,6 @@ def _run_bootstrap( if h_int in horizon_to_col: X_2_es[i, horizon_to_col[h_int]] = 1.0 - coef_es = solve_ols( - X_2_es, - y_tilde, - return_vcov=False, - weights=survey_weights, - weight_type=survey_weight_type, - )[0] - eps_2_es = y_tilde - np.dot(X_2_es, coef_es) - S_es, bread_es, _ = self._compute_cluster_S_scores( df=df, unit=unit, @@ -505,7 +564,6 @@ def _run_bootstrap( delta_hat=delta_hat, kept_cov_mask=kept_cov_mask, X_2=X_2_es, - eps_2=eps_2_es, cluster_ids=cluster_ids, survey_weights=survey_weights, ) @@ -556,15 +614,6 @@ def _run_bootstrap( if g in group_to_col: X_2_grp[i, group_to_col[g]] = 1.0 - coef_grp = solve_ols( - X_2_grp, - y_tilde, - return_vcov=False, - weights=survey_weights, - weight_type=survey_weight_type, - )[0] - eps_2_grp = y_tilde - np.dot(X_2_grp, coef_grp) - S_grp, bread_grp, _ = self._compute_cluster_S_scores( df=df, unit=unit, @@ -576,7 +625,6 @@ def _run_bootstrap( delta_hat=delta_hat, kept_cov_mask=kept_cov_mask, X_2=X_2_grp, - eps_2=eps_2_grp, cluster_ids=cluster_ids, survey_weights=survey_weights, ) diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md index ce725563..82cd60dd 100644 --- a/docs/methodology/REGISTRY.md +++ b/docs/methodology/REGISTRY.md @@ -1426,6 +1426,7 @@ Our implementation uses multiplier bootstrap on the GMM influence function: clus - **Note:** Both the iterative FE solver (`_iterative_fe`, Stage 1) and the iterative alternating-projection demeaning helper (`_iterative_demean`, used in covariate residualization) emit `UserWarning` when `max_iter` exhausts without reaching `tol`, via `diff_diff.utils.warn_if_not_converged`. Silent return of the current iterate was classified as a silent failure under the Phase 2 audit and replaced with an explicit signal to match the logistic/Poisson IRLS pattern in `linalg.py`. - **Note:** When the Stage-2 bread `X'_2 W X_2` is singular, both the analytical TSL variance (`two_stage.py`) and the multiplier-bootstrap bread (`two_stage_bootstrap.py`) now emit a `UserWarning` before falling back to `np.linalg.lstsq`. Previously this fallback was silent. Sibling of axis-A finding #17 in the Phase 2 silent-failures audit; surfaced by the repo-wide lstsq-fallback pattern grep that accompanied the StaggeredTripleDifference fix. - **Note:** The GMM sandwich and bootstrap paths both use `scipy.sparse.linalg.factorized` for the Stage 1 normal-equations solve `(X'_{10} W X_{10}) gamma = X'_1 W X_2` and fall back to dense `lstsq` when the sparse factorization raises `RuntimeError` on a near-singular matrix. Both fallback sites emit a `UserWarning` (silent-failure audit axis C) so callers know SE estimates came from the degraded path rather than the fast sparse path. +- **Note:** The GMM sandwich re-solves the Stage-1 unit+time fixed effects **exactly** (sparse OLS reusing the `scipy.sparse.linalg.factorized` factorization of `(X'_{10} W X_{10})` already computed for `gamma_hat`), rather than reusing the iterative alternating-projection FE (`_iterative_fe`) that produces the point estimate. The iterative solver converges only to ~1e-7 on unbalanced *untreated* panels — negligible for the ATT, but enough to perturb the variance by ~1% relative to the analytical GMM sandwich. The exact re-solve makes the analytical GMM SE match R `did2s` to ~1e-7 (`tests/test_methodology_two_stage.py::TestTwoStageDiDParityR`), mirroring ImputationDiD's exact-sparse variance path (`imputation.py` `_build_A_sparse`). It also required adding an **intercept column** to `_build_fe_design` so the first-stage column space spans the constant (the grand mean): the prior intercept-free `[unit_1.., time_1..]` layout (drop first unit + first time, no intercept) silently omitted the grand mean, which the exact residual is first-order sensitive to (the iterative point-estimate solver absorbs the grand mean into its mean-based FE, so the point estimate was unaffected). Obs whose unit or time FE are unidentified (NaN; rank-deficient / Proposition-5) fall back to the iterative residual, so those edge cases are unchanged. The reported `overall_att` still uses the iterative FE (preserving point-estimate equivalence with ImputationDiD at 1e-10); only the variance uses the exact residuals. - **Note:** `vcov_type` is permanently narrow to `{"hc1"}` (Phase 1b threading). TwoStageDiD's variance is the Gardner (2022) two-stage GMM cluster-sandwich `V = (X'_2 W X_2)^{-1} (S' S) (X'_2 W X_2)^{-1}` with the per-cluster GMM-corrected score `S_g = gamma_hat' c_g - X'_{2g} eps_{2g}`. Analytical-sandwich families `{classical, hc2, hc2_bm}` are rejected at `__init__`/`fit()`: the GMM-corrected meat folds first-stage FE estimation uncertainty into the score via the `gamma_hat' c_g` term, so there is no single hat matrix spanning both stages on which HC2 leverage or Bell-McCaffrey Satterthwaite DOF can be defined, and the Gardner first-stage correction has not been derived for the leverage-corrected or homoskedastic meat structures (no reference implementation — `clubSandwich` covers single-equation WLS/OLS CR2, not two-stage GMM; mirrors the SpilloverDiD `vcov_type="classical"` rejection). `cluster=` selects the cluster level; `cluster=None` (the default) still clusters at the `unit` column (`cluster_var = unit`), so the summary renders `"CR1 cluster-robust at , G="` rather than the generic `"HC1"` label. **Note (deviation from R):** the did2s GMM sandwich uses NO finite-sample multiplier (meat `= S' S`), so the rendered `CR1` family label carries no Stata-style `(n-1)/(n-p)` or `G/(G-1)` factor (matches R `did2s`; same FSA-free convention as ImputationDiD's Theorem 3 variance). Under bootstrap (`n_bootstrap > 0`) the analytical variance-family label is suppressed in `summary()` because `fit()` overwrites the reported SE/CI/p-value with `bootstrap_results` (mirrors `DiDResults` at `results.py:213-226`). `cluster=` combined with a replicate-weight survey design raises `NotImplementedError` (replicate-refit variance ignores `cluster=`). `vcov_type='conley'` is deferred to the TwoStageDiD Conley follow-up row in TODO.md. **Reference implementation(s):** @@ -1438,6 +1439,7 @@ Our implementation uses multiplier bootstrap on the GMM influence function: clus - [x] GMM sandwich variance (Newey & McFadden 1994 Theorem 6.1) - [x] Global `(D'D)^{-1}` in variance (faithful to Gardner §3.3 / Newey-McFadden GMM sandwich; matches R `did2s`) - [x] No finite-sample adjustment (raw asymptotic sandwich) +- [x] Analytical GMM SE matches R `did2s` to ~1e-7 (exact Stage-1 re-solve; `TestTwoStageDiDParityR`) - [x] Always-treated units excluded with warning - [x] Multiplier bootstrap on GMM influence function - [x] Event study and overall ATT aggregation diff --git a/docs/methodology/papers/gardner-2022-review.md b/docs/methodology/papers/gardner-2022-review.md index 1a626147..322ea07b 100644 --- a/docs/methodology/papers/gardner-2022-review.md +++ b/docs/methodology/papers/gardner-2022-review.md @@ -102,9 +102,9 @@ where `W_gpit = [Y_gpit, (1g)_gpit, (1p)_gpit, D_gp]`. The first moment block is - [x] **Variance = joint-GMM Newey-McFadden Thm 6.1 sandwich with the GLOBAL Jacobian inverse, meat clustered at the unit** (Appendix B `vce(cluster id)`); **no finite-sample multiplier**. Library matches this (the global-bread sandwich `(X'₂WX₂)^{-1}(S'S)(X'₂WX₂)^{-1}` with unit-clustered GMM-corrected score `S_g = γ̂'c_g − X'_{2g}ε_{2g}`). - [x] Always-treated exclusion (fn 19); covariates in both stages (fn 9, `covariates=` — the in-both-regressions approach); `anticipation` handling. - ⚠️ **Paper-permitted but NOT exposed by `TwoStageDiD`** (recorded as scope, not a deliverable): the **`P̄`-average estimand eq. (5)** (duration-restricted Stage-2 sample) and the **full-sample first-stage variant (fn 8)** — both described in the paper as valid modifications, but the library has no public parameter for either (`get_params()` = `anticipation, alpha, cluster, n_bootstrap, bootstrap_weights, seed, rank_deficient_action, horizon_max, pretrends, vcov_type`; Stage 1 is always untreated-only). The fn-9 last-pre-treatment / doubly-robust covariate variants are likewise not exposed (only the in-both-stages approach is). -- [ ] **Dedicated `tests/test_methodology_two_stage.py`** with eq./section-numbered Verified Components (Stage-1 FE recovery; Stage-2 overall ATT eq. 4 + event-study eq. 6; GMM first-stage-correction behavior; always-treated drop). — *PR-B.* -- [ ] **R parity fixture vs `did2s`** (`benchmarks/R/generate_did2s_golden.R` + `benchmarks/data/did2s_golden.json` + `did2s_test_panel.csv`); overall ATT + event-study ATT/SE. — *PR-B.* -- [x] **`two_stage.py` docstring checked — already accurate.** `_compute_gmm_variance`'s docstring (`two_stage.py:2842-2857`) states the code uses the GLOBAL Hessian inverse matching `did2s` and gives the correct sandwich `V = (X'₂X₂)^{-1}(Σ_g S_g S_g')(X'₂X₂)^{-1}`; it does **not** fabricate a paper-Eq.6 deviation, so **no source correction is needed**. The misattribution was confined to the REGISTRY note (fixed in this PR). +- [x] **Dedicated `tests/test_methodology_two_stage.py`** with eq./section-numbered Verified Components (Stage-1 FE recovery; Stage-2 overall ATT eq. 4 + event-study eq. 6; GMM first-stage-correction behavior; always-treated drop). — *PR-B.* +- [x] **R parity fixture vs `did2s`** (`benchmarks/R/generate_did2s_golden.R` + `benchmarks/data/did2s_golden.json` + `did2s_test_panel.csv`); overall ATT (abs 1e-6) + event-study ATT (1e-6) / SE (1e-7). — *PR-B.* +- [x] **`two_stage.py` docstring/formula faithful; variance made exact in PR-B.** `_compute_gmm_variance`'s docstring (`two_stage.py:2842-2857`) states the code uses the GLOBAL Hessian inverse matching `did2s` and gives the correct sandwich `V = (X'₂X₂)^{-1}(Σ_g S_g S_g')(X'₂X₂)^{-1}` — the *formula* needed no correction (the Eq.6 misattribution was confined to the REGISTRY note, fixed in PR-A). However, the PR-B `did2s` SE parity surfaced a ~1% **numerical** gap: the variance derived its residuals from the *iterative* alternating-projection first-stage FE (`_iterative_fe`, converged to ~1e-7 on unbalanced untreated panels) while computing `gamma_hat` exactly. PR-B corrects this by re-solving the Stage-1 FE **exactly** inside the variance (sparse OLS, reusing the `gamma_hat` factorization) and adding an intercept to `_build_fe_design` so its column space spans the grand mean — yielding ~1e-7 `did2s` SE parity, mirroring ImputationDiD's exact-sparse variance. The point estimate stays on the iterative FE (twin-equivalence preserved). See REGISTRY `## TwoStageDiD` Notes. --- @@ -152,4 +152,4 @@ No regularization, bandwidth, factor-count, or cross-validation tuning (the esti | **"No finite-sample multiplier" deviation note** (REGISTRY `## TwoStageDiD`). | **Accurate and correctly labeled.** Appendix B's `onestep`/`vce(cluster id)` GMM carries no `(n-1)/(n-p)` or `G/(G-1)` factor; the library matches (meat `= S'S`). | **No change** — verified faithful. | | **Version.** Reviewed arXiv:2207.05943**v1** (title page "April 2021"; arXiv-stamped 13 Jul 2022). | Equation numbers are this version's. The misattributions above are **independent of versioning**: even if a later version renumbered the variance to "(6)", the per-cluster-inverse claim is false regardless (the variance uses a global inverse). | Citation pinned to **Gardner (2022)** per `docs/references.rst`. | -**PR-B deliverables** (tracked in `TODO.md` Testing/Docs): `tests/test_methodology_two_stage.py` (Stage-1/Stage-2/GMM-correction/always-treated Verified Components); `did2s` R parity fixture; then flip the `METHODOLOGY_REVIEW.md` `TwoStageDiD` row **In Progress → Complete**. (No `two_stage.py` source change is required — the docstring is already accurate; the only documentation defect, the REGISTRY misattributions, is fixed in this PR.) +**PR-B deliverables** (tracked in `TODO.md` Testing/Docs): `tests/test_methodology_two_stage.py` (Stage-1/Stage-2/GMM-correction/always-treated Verified Components); `did2s` R parity fixture; then flip the `METHODOLOGY_REVIEW.md` `TwoStageDiD` row **In Progress → Complete**. (The variance *formula* and docstring needed no correction; the only documentation defect, the REGISTRY misattributions, was fixed in PR-A. PR-B did make one **numerical-accuracy** source change — re-solving the Stage-1 FE exactly inside `_compute_gmm_variance` + an intercept in `_build_fe_design` — so the analytical GMM SE matches R `did2s` to ~1e-7; see the checklist item above and REGISTRY `## TwoStageDiD` Notes.) diff --git a/tests/test_methodology_two_stage.py b/tests/test_methodology_two_stage.py new file mode 100644 index 00000000..5adc6d80 --- /dev/null +++ b/tests/test_methodology_two_stage.py @@ -0,0 +1,820 @@ +"""Methodology verification tests for TwoStageDiD. + +Targets Gardner, J. (2022), *Two-stage differences in differences*, +arXiv:2207.05943 [econ.EM]. Each Verified Component class maps to a numbered +section / equation of the paper, verified against the source PDF in +``docs/methodology/papers/gardner-2022-review.md``: + +- **§3, eqs. (4) / (6)** — the two-stage procedure (Stage 1 fits unit+time FE on + the untreated set Omega_0 only; Stage 2 regresses the residualized outcome on + treatment indicators) recovers the overall ATT ``E(beta_gp | D_gp = 1)`` (eq. 4) + and the event-study horizons (Step 2' / eq. 6) under arbitrary heterogeneity + (``TestGardner2022Section3TwoStageProcedure``). +- **§3.3 + Appendix B** — the joint-GMM Newey-McFadden (1994) Theorem 6.1 + sandwich with the GLOBAL Jacobian inverse, meat clustered at the unit, NO + finite-sample multiplier; the variance folds in Stage-1 estimation error via + the ``gamma_hat' c_g`` correction (``TestGardner2022Section33GMMVariance``). +- **fn. 19 + Proposition 5 of Borusyak et al. (2024)** — always-treated units are + excluded (no untreated obs for Stage 1); without never-treated units horizons + ``h >= h_bar = max(groups) - min(groups)`` are not identified -> NaN + warning; + REGISTRY edge cases ``balance_e`` and zero-obs cohorts + (``TestGardner2022Identification``). +- Library extensions / deviations (multiplier bootstrap on the GMM influence + function; ``vcov_type`` narrowed to the GMM sandwich) -- Gardner prescribes + analytical GMM SEs only (``TestGardner2022LibraryDeviations``). + +R-parity (bottom of file, NOT a methodology walk-through): ``TestTwoStageDiDParityR`` +pins Python output against R ``did2s::did2s()`` on a fixed-seed golden. R ``did2s`` +defaults to analytical corrected clustered SEs (``bootstrap = FALSE``), the same +GMM sandwich the library computes; see ``docs/methodology/REGISTRY.md`` +``## TwoStageDiD``. + +Point estimates coincide with ImputationDiD (Borusyak, Jaravel & Spiess 2024) by +construction -- *functional* equivalence is covered by +``tests/test_two_stage.py::TestTwoStageDiDEquivalence``; this file asserts the +paper-grounded contract and the did2s cross-language parity. + +See also: + +- ``docs/methodology/papers/gardner-2022-review.md`` (primary-source review) +- ``docs/methodology/REGISTRY.md`` ``## TwoStageDiD`` block +- ``METHODOLOGY_REVIEW.md`` ``TwoStageDiD`` section +- ``tests/test_two_stage.py`` (implementation-detail unit tests) +- ``benchmarks/R/generate_did2s_golden.R`` (R golden generator) +""" + +from __future__ import annotations + +import json +import warnings +from pathlib import Path +from typing import Any, Dict, List, Optional + +import numpy as np +import pandas as pd +import pytest + +from diff_diff import ImputationDiD, TwoStageDiD + +# ============================================================================= +# Module-level R-fixture availability + per-class seed decorrelation +# ============================================================================= + +GOLDEN_PATH = Path(__file__).parent.parent / "benchmarks" / "data" / "did2s_golden.json" +PANEL_PATH = Path(__file__).parent.parent / "benchmarks" / "data" / "did2s_test_panel.csv" +_R_FIXTURE_AVAILABLE = GOLDEN_PATH.is_file() and PANEL_PATH.is_file() + +_BASE_SEED_PROC = 7101 +_BASE_SEED_VAR = 7202 +_BASE_SEED_IDENT = 7303 +_BASE_SEED_DEV = 7404 + + +# ============================================================================= +# Helpers +# ============================================================================= + + +def _make_staggered_panel( + rng: np.random.Generator, + *, + cohorts: List[int], + n_per_cohort: int = 100, + n_periods: int = 6, + tau_constant: Optional[float] = None, + tau_by_horizon: Optional[Dict[int, float]] = None, + sigma: float = 0.1, + include_never_treated: bool = True, + pretrend_slope: float = 0.0, +) -> pd.DataFrame: + """Balanced staggered-adoption panel satisfying parallel trends. + + DGP (Gardner eq. 1): ``y_it = c_i + beta_t + w_it * tau_{K_it} + u_it``, + with ``c_i ~ N(0,1)``, common time trend ``beta_t = 0.5 t`` (parallel + trends hold -- no cohort-specific trends unless ``pretrend_slope != 0``), + ``u_it ~ N(0, sigma^2)``. Treatment is absorbing from the cohort's event + date. ``first_treat = 0`` denotes never-treated. + """ + if tau_constant is None and tau_by_horizon is None: + tau_constant = 1.0 + rows: List[Dict[str, Any]] = [] + unit_id = 0 + all_cohorts = ([0] + list(cohorts)) if include_never_treated else list(cohorts) + cohort_rank = {g: r for r, g in enumerate(sorted(cohorts))} + for g in all_cohorts: + for _ in range(n_per_cohort): + c_i = rng.standard_normal() + for t in range(1, n_periods + 1): + beta_t = 0.5 * t + u = sigma * rng.standard_normal() + treated = g > 0 and t >= g + if treated: + k = t - g + if tau_by_horizon is not None: + tau = tau_by_horizon.get(k, 0.0) + else: + tau = tau_constant if tau_constant is not None else 0.0 + else: + tau = 0.0 + trend = pretrend_slope * cohort_rank.get(g, 0) * t if g > 0 else 0.0 + y = c_i + beta_t + trend + (tau if treated else 0.0) + u + rows.append( + { + "unit": unit_id, + "time": t, + "first_treat": g, + "outcome": y, + } + ) + unit_id += 1 + return pd.DataFrame(rows) + + +# ============================================================================= +# Section 3, eqs. (4) / (6) — the two-stage procedure +# ============================================================================= + + +class TestGardner2022Section3TwoStageProcedure: + """§3 (eqs. 4/6): Stage 1 fits unit+time FE on the untreated subsample + (Omega_0), Stage 2 regresses the residualized outcome on treatment status; + this recovers the overall ATT (eq. 4) and event-study horizons (eq. 6).""" + + def test_recovers_overall_att_eq4(self) -> None: + """Under a constant effect tau=2.0, the overall ATT E(beta|D=1) (eq. 4) + is recovered. DGP: 2 cohorts + never-treated, N=300, sigma=0.1.""" + rng = np.random.default_rng(_BASE_SEED_PROC + 1) + panel = _make_staggered_panel( + rng, cohorts=[3, 4], n_per_cohort=100, tau_constant=2.0, sigma=0.1 + ) + res = TwoStageDiD().fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + assert abs(res.overall_att - 2.0) < 0.05 + + def test_recovers_heterogeneous_event_study_eq6(self) -> None: + """Horizon-specific effects tau_K = 1 + 0.5*K are recovered per horizon + via the Step 2' event-study spec (eq. 6).""" + rng = np.random.default_rng(_BASE_SEED_PROC + 2) + tau_by_h = {0: 1.0, 1: 1.5, 2: 2.0, 3: 2.5} + panel = _make_staggered_panel( + rng, cohorts=[2, 3], n_per_cohort=120, tau_by_horizon=tau_by_h, sigma=0.1 + ) + res = TwoStageDiD().fit( + panel, + outcome="outcome", + unit="unit", + time="time", + first_treat="first_treat", + aggregate="event_study", + ) + assert res.event_study_effects is not None + for h, expected in tau_by_h.items(): + assert h in res.event_study_effects, f"missing horizon {h}" + got = res.event_study_effects[h]["effect"] + assert abs(got - expected) < 0.06, f"h={h}: {got:.4f} vs {expected}" + + def test_stage1_uses_untreated_only(self) -> None: + """Perturbing a single treated outcome by delta shifts the overall ATT by + exactly delta/N_treated -- proving treated observations never feed back + into the Stage-1 counterfactual model (FE fit on Omega_0 only).""" + rng = np.random.default_rng(_BASE_SEED_PROC + 3) + panel = _make_staggered_panel( + rng, cohorts=[3, 4], n_per_cohort=60, tau_constant=1.0, sigma=0.1 + ) + base = TwoStageDiD().fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + n_treated = int( + ((panel["first_treat"] > 0) & (panel["time"] >= panel["first_treat"])).sum() + ) + + perturbed = panel.copy() + treated_idx = perturbed.index[ + (perturbed["first_treat"] > 0) & (perturbed["time"] >= perturbed["first_treat"]) + ][0] + delta = 100.0 + perturbed.loc[treated_idx, "outcome"] += delta + pert = TwoStageDiD().fit( + perturbed, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + # Only the perturbed obs's own residual changes (Stage-2 weight 1/N_treated). + assert abs((pert.overall_att - base.overall_att) - delta / n_treated) < 1e-6 + + def test_coincides_with_imputation_estimand(self) -> None: + """Gardner's two-stage estimator coincides in point estimates with the + imputation estimator (BJS 2024 p. 3258 / paper review "Relation"). The + overall ATT matches ImputationDiD to machine precision (they differ only + in the variance estimator).""" + rng = np.random.default_rng(_BASE_SEED_PROC + 4) + panel = _make_staggered_panel( + rng, cohorts=[2, 4], n_per_cohort=70, n_periods=6, tau_constant=1.5, sigma=0.2 + ) + ts = TwoStageDiD().fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + imp = ImputationDiD().fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + assert ts.overall_att == pytest.approx(imp.overall_att, abs=1e-10) + + def test_covariates_residualized_in_both_stages(self) -> None: + """The live ``covariates=`` path (fn. 9 in-both-stages): a time-varying + covariate correlated with treatment status would bias the ATT if ignored, + but residualizing it in both stages (Stage-1 delta_hat fit on Omega_0, + subtracted from all obs) recovers the true ATT. + + DGP: y = c_i + beta_t + theta*x + tau*D + u, with x shifted by +s on + treated-post obs (so x is correlated with D). Omitting x leaves theta*x in + y_tilde -> loads onto D -> ATT biased by ~theta*s; including x removes it. + """ + rng = np.random.default_rng(_BASE_SEED_PROC + 5) + theta, tau, shift = 2.0, 1.0, 1.0 + rows: List[Dict[str, Any]] = [] + uid = 0 + for g in (0, 3, 4): + for _ in range(80): + c_i = rng.standard_normal() + for t in range(1, 7): + treated = g > 0 and t >= g + x = rng.standard_normal() + (shift if treated else 0.0) + y = ( + c_i + + 0.5 * t + + theta * x + + (tau if treated else 0.0) + + 0.1 * rng.standard_normal() + ) + rows.append({"unit": uid, "time": t, "first_treat": g, "x": x, "outcome": y}) + uid += 1 + panel = pd.DataFrame(rows) + + att_no_cov = ( + TwoStageDiD() + .fit(panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat") + .overall_att + ) + att_cov = ( + TwoStageDiD() + .fit( + panel, + outcome="outcome", + unit="unit", + time="time", + first_treat="first_treat", + covariates=["x"], + ) + .overall_att + ) + # Without adjustment, the confounded covariate biases the ATT well away + # from tau; with adjustment, the true ATT is recovered. + assert abs(att_no_cov - tau) > 0.5, f"expected bias, got {att_no_cov:.4f}" + assert abs(att_cov - tau) < 0.05, f"covariate path failed: {att_cov:.4f}" + + +# ============================================================================= +# Section 3.3 + Appendix B — joint-GMM Newey-McFadden Theorem 6.1 variance +# ============================================================================= + + +class TestGardner2022Section33GMMVariance: + """§3.3 + App. B: the GMM sandwich with the GLOBAL Jacobian inverse, meat + clustered at the unit, NO finite-sample multiplier; the variance folds in + Stage-1 estimation error via the gamma_hat' c_g correction.""" + + def test_overall_se_finite_and_positive(self) -> None: + rng = np.random.default_rng(_BASE_SEED_VAR + 1) + panel = _make_staggered_panel( + rng, cohorts=[3, 4], n_per_cohort=80, tau_constant=1.0, sigma=0.2 + ) + res = TwoStageDiD().fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + assert np.isfinite(res.overall_se) and res.overall_se > 0 + + def test_event_study_ses_finite(self) -> None: + rng = np.random.default_rng(_BASE_SEED_VAR + 2) + panel = _make_staggered_panel( + rng, cohorts=[2, 3], n_per_cohort=80, tau_constant=1.0, sigma=0.2 + ) + res = TwoStageDiD().fit( + panel, + outcome="outcome", + unit="unit", + time="time", + first_treat="first_treat", + aggregate="event_study", + ) + assert res.event_study_effects is not None + for h, eff in res.event_study_effects.items(): + # Skip the normalized reference period (effect=se=0 by construction). + if h >= 0 and np.isfinite(eff["effect"]): + assert np.isfinite(eff["se"]) and eff["se"] > 0, f"h={h}" + + def test_first_stage_correction_is_nontrivial(self) -> None: + """The GMM sandwich folds Stage-1 FE estimation uncertainty into the score + via the gamma_hat' c_g correction. On a design with limited untreated + support per FE (so the first-stage uncertainty is non-negligible), the + GMM SE is materially LARGER than a naive benchmark that treats the + residualized outcome y_tilde as raw data and ignores first-stage + estimation -- the iid SE of its mean, ``std(y_tilde) / sqrt(N_treated)``. + + Asserts a directional/magnitude gap (not bare inequality) so a + near-homoskedastic coincidence can't pass it. The overall ATT is the mean + of ``tau_hat`` (= y_tilde for treated obs) over treated observations, so + the naive floor is exactly the no-first-stage SE of that mean. + """ + rng = np.random.default_rng(_BASE_SEED_VAR + 3) + # Few periods + modest units -> each time/unit FE rests on little + # untreated support, so the first-stage correction visibly inflates the SE. + panel = _make_staggered_panel( + rng, cohorts=[2, 3], n_per_cohort=40, n_periods=4, tau_constant=1.0, sigma=0.5 + ) + res = TwoStageDiD().fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + gmm_se = res.overall_se + + # Naive benchmark: y_tilde for the treated obs is exposed as `tau_hat`; + # the overall ATT is its (unweighted) mean. The SE of that mean IGNORING + # the generated-regressand (first-stage) correction is std/sqrt(N_treated). + tau_hat = res.treatment_effects["tau_hat"].values + tau_hat = tau_hat[np.isfinite(tau_hat)] + naive_floor = float(np.std(tau_hat, ddof=1)) / np.sqrt(len(tau_hat)) + + assert np.isfinite(gmm_se) and gmm_se > 0 + assert gmm_se > naive_floor * 1.05, ( + f"GMM SE {gmm_se:.5f} should exceed the no-first-stage floor " + f"{naive_floor:.5f} by a clear margin (the gamma_hat' c_g correction)" + ) + + def test_no_fsa_global_inverse_se_pin(self) -> None: + """Regression pin of the overall SE on a fixed-seed panel. Locks the + global-inverse + no-finite-sample-multiplier GMM sandwich convention; a + revert to a per-cluster inverse or an FSA multiplier would move this + number. Deterministic given the seed (the SE computation has no + randomness).""" + rng = np.random.default_rng(_BASE_SEED_VAR + 7) + panel = _make_staggered_panel( + rng, cohorts=[2, 4], n_per_cohort=50, n_periods=6, tau_constant=1.0, sigma=0.3 + ) + res = TwoStageDiD().fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + assert np.isfinite(res.overall_se) and res.overall_se > 0 + assert res.overall_se == pytest.approx(_SE_PIN, abs=1e-8) + + def test_pretrends_event_study_ses_finite(self) -> None: + """With pretrends=True the Stage-2 design adds pre-period leads; their GMM + SEs are finite and the overall ATT is unchanged (REGISTRY edge case).""" + rng = np.random.default_rng(_BASE_SEED_VAR + 4) + panel = _make_staggered_panel( + rng, cohorts=[3, 4], n_per_cohort=80, n_periods=6, tau_constant=1.0, sigma=0.2 + ) + base = TwoStageDiD().fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + res = TwoStageDiD(pretrends=True).fit( + panel, + outcome="outcome", + unit="unit", + time="time", + first_treat="first_treat", + aggregate="event_study", + ) + assert res.event_study_effects is not None + # At least one pre-period lead is present and has a finite SE. + leads = [h for h in res.event_study_effects if h < 0] + assert leads, "pretrends=True should expose pre-period leads" + for h in leads: + eff = res.event_study_effects[h] + if np.isfinite(eff["effect"]): + assert np.isfinite(eff["se"]), f"lead h={h} SE non-finite" + # Overall ATT is unchanged by requesting pre-period leads. + assert res.overall_att == pytest.approx(base.overall_att, abs=1e-8) + + def test_unit_clustered_default(self) -> None: + """cluster=None clusters the meat at the unit level (Appendix B + vce(cluster id)); n_clusters equals the number of units.""" + rng = np.random.default_rng(_BASE_SEED_VAR + 5) + panel = _make_staggered_panel( + rng, cohorts=[3, 4], n_per_cohort=50, tau_constant=1.0, sigma=0.2 + ) + res = TwoStageDiD().fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + assert res.n_clusters == panel["unit"].nunique() + + def test_singular_omega0_warns_and_falls_back(self) -> None: + """A rank-deficient Stage-1 design (a period observed only among treated + obs -> its time FE is unidentified in Omega_0 -> X_10'X_10 singular) makes + the sparse factorization in the GMM variance fail; the code must emit a + UserWarning and route to the dense lstsq fallback, still returning a + finite SE.""" + rng = np.random.default_rng(_BASE_SEED_VAR + 9) + rows: List[Dict[str, Any]] = [] + uid = 0 + for g in (0, 2): + for _ in range(30): + c_i = rng.standard_normal() + for t in (1, 2, 3, 4): + if g == 0 and t == 4: + continue # never-treated not observed at t=4 + treated = g > 0 and t >= g + y = c_i + 0.5 * t + (1.0 if treated else 0.0) + 0.1 * rng.standard_normal() + rows.append({"unit": uid, "time": t, "first_treat": g, "outcome": y}) + uid += 1 + panel = pd.DataFrame(rows) + + with warnings.catch_warnings(record=True) as caught: + warnings.simplefilter("always") + res = TwoStageDiD(rank_deficient_action="silent").fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + assert any( + "lstsq" in str(w.message) for w in caught + ), "expected the dense-lstsq fallback warning under a singular Omega_0" + assert np.isfinite(res.overall_se) + + def test_bootstrap_scores_use_exact_residuals(self) -> None: + """The multiplier bootstrap builds its per-cluster GMM scores from the SAME + exact Stage-1 / Stage-2 residuals as the analytical variance (the shared + `_exact_gmm_residuals` helper), NOT the ~1e-7 iterative residualized + outcome. Since bootstrap SEs override the analytical SE when + `n_bootstrap > 0`, a residual mismatch would silently report the pre-fix + ~1% approximate SE. White-box: the bootstrap influence variance + `bread @ (S'S) @ bread` must equal the analytical `overall_se**2` to + machine precision on an unbalanced untreated panel (where iterative and + exact residuals differ).""" + rng = np.random.default_rng(_BASE_SEED_VAR + 11) + panel = _make_staggered_panel( + rng, cohorts=[2, 4], n_per_cohort=50, n_periods=6, tau_constant=1.0, sigma=0.3 + ) + est = TwoStageDiD() + res = est.fit(panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat") + + # Reconstruct the Stage-1 internals the bootstrap score-builder consumes. + df = panel.copy() + omega_0_mask = ~((df["first_treat"] > 0) & (df["time"] >= df["first_treat"])) + unit_fe, time_fe, gm, delta_hat, kcm = est._fit_untreated_model( + df, "outcome", "unit", "time", None, omega_0_mask + ) + df["_y_tilde"] = est._residualize( + df, "outcome", "unit", "time", None, unit_fe, time_fe, gm, delta_hat + ) + treated = ((df["first_treat"] > 0) & (df["time"] >= df["first_treat"])).values + X_2 = treated.astype(float).reshape(-1, 1) + S, bread, _ = est._compute_cluster_S_scores( + df, + "unit", + "time", + None, + omega_0_mask, + unit_fe, + time_fe, + delta_hat, + kcm, + X_2, + df["unit"].values, + None, + ) + v_boot = bread @ (S.T @ S) @ bread + # Identical to the analytical exact SE (would be ~1% off with iterative residuals). + assert np.sqrt(v_boot[0, 0]) == pytest.approx(res.overall_se, abs=1e-9) + + def test_bootstrap_event_study_scores_use_exact_residuals(self) -> None: + """Multi-column (event-study) bootstrap GMM scores also use the shared + exact-residual helper (CI-review D1). White-box: for a multi-horizon + Stage-2 design, the bootstrap influence variance `bread @ (S'S) @ bread` + equals the analytical `_compute_gmm_variance` vcov **elementwise** to + machine precision — both are single-sourced through `_exact_gmm_residuals`, + so the k>1 path cannot silently diverge onto the iterative residuals.""" + rng = np.random.default_rng(_BASE_SEED_VAR + 12) + panel = _make_staggered_panel( + rng, cohorts=[2, 4], n_per_cohort=50, n_periods=6, tau_constant=1.0, sigma=0.3 + ) + est = TwoStageDiD() + est.fit(panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat") + + df = panel.copy() + omega_0_mask = ~((df["first_treat"] > 0) & (df["time"] >= df["first_treat"])) + unit_fe, time_fe, gm, delta_hat, kcm = est._fit_untreated_model( + df, "outcome", "unit", "time", None, omega_0_mask + ) + df["_y_tilde"] = est._residualize( + df, "outcome", "unit", "time", None, unit_fe, time_fe, gm, delta_hat + ) + # Multi-column Stage-2 design: indicators for event-time horizons 0 and 1. + treated = (df["first_treat"] > 0) & (df["time"] >= df["first_treat"]) + horizon = (df["time"] - df["first_treat"]).where(treated, other=-1).to_numpy() + X_2 = np.column_stack([(horizon == 0).astype(float), (horizon == 1).astype(float)]) + cluster_ids = df["unit"].to_numpy() + + v_analytical = est._compute_gmm_variance( + df, + "unit", + "time", + None, + omega_0_mask, + unit_fe, + time_fe, + delta_hat, + kcm, + X_2, + cluster_ids, + ) + S, bread, _ = est._compute_cluster_S_scores( + df, + "unit", + "time", + None, + omega_0_mask, + unit_fe, + time_fe, + delta_hat, + kcm, + X_2, + cluster_ids, + None, + ) + v_boot = bread @ (S.T @ S) @ bread + np.testing.assert_allclose(v_boot, v_analytical, atol=1e-9) + + +# Pin value for test_no_fsa_global_inverse_se_pin, produced by the global-inverse +# GMM sandwich code path (see the test docstring). Deterministic given the +# fixed-seed design (the SE computation itself has no randomness). +_SE_PIN = 0.03679572008170665 + + +# ============================================================================= +# Identification — always-treated, Proposition 5, edge cases +# ============================================================================= + + +class TestGardner2022Identification: + """fn. 19 (always-treated exclusion) + Proposition 5 of Borusyak et al. + (2024) (no never-treated -> horizons h >= h_bar unidentified) + REGISTRY + edge cases (balance_e, zero-obs cohorts).""" + + def test_always_treated_excluded_with_warning(self) -> None: + """Units treated in the first observed period have no untreated obs for + Stage 1; they are excluded with a warning listing the IDs, and their + treated obs do not enter Stage 2.""" + rng = np.random.default_rng(_BASE_SEED_IDENT + 1) + panel = _make_staggered_panel( + rng, cohorts=[3, 4], n_per_cohort=50, n_periods=6, tau_constant=1.0, sigma=0.1 + ) + n_treated_orig = ( + TwoStageDiD() + .fit(panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat") + .n_treated_units + ) + + # Force 5 units from a treated cohort to onset at period 1 (= min time), + # making them always-treated (no untreated obs for Stage 1). + cohort3_units = panel.loc[panel["first_treat"] == 3, "unit"].unique()[:5] + panel.loc[panel["unit"].isin(cohort3_units), "first_treat"] = 1 + + with pytest.warns(UserWarning, match="treated in all observed periods"): + res = TwoStageDiD().fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + # The 5 always-treated units are dropped, so the treated-unit count falls + # by exactly 5, and their obs do not enter Stage 2. + assert res.n_treated_units == n_treated_orig - 5 + assert np.isfinite(res.overall_att) + + def test_no_never_treated_horizons_nan(self) -> None: + """Proposition 5 (Borusyak et al. 2024): with no never-treated units and + h_bar = max(groups) - min(groups), horizons h >= h_bar are not identified + -> NaN effect with n_obs > 0 and a warning. Regression-pins the behavior + implemented at two_stage.py:2531-2674 (mirror of ImputationDiD).""" + rng = np.random.default_rng(_BASE_SEED_IDENT + 2) + # Cohorts 3 and 5, NO never-treated => h_bar = 5 - 3 = 2. + panel = _make_staggered_panel( + rng, + cohorts=[3, 5], + n_per_cohort=80, + n_periods=8, + tau_constant=1.0, + sigma=0.1, + include_never_treated=False, + ) + with pytest.warns(UserWarning, match="identified"): + res = TwoStageDiD().fit( + panel, + outcome="outcome", + unit="unit", + time="time", + first_treat="first_treat", + aggregate="event_study", + ) + assert res.event_study_effects is not None + h_bar = 2 + flagged = [ + h + for h, eff in res.event_study_effects.items() + if h >= h_bar and np.isnan(eff["effect"]) + ] + assert flagged, "expected NaN horizons at or above h_bar" + for h in flagged: + assert res.event_study_effects[h]["n_obs"] > 0, f"h={h} should keep n_obs>0" + + def test_balance_e_no_qualifying_cohorts_warns(self) -> None: + """balance_e larger than any cohort's available pre-window -> no cohort + qualifies -> warning + event study contains only the reference period.""" + rng = np.random.default_rng(_BASE_SEED_IDENT + 3) + panel = _make_staggered_panel( + rng, cohorts=[2, 3], n_per_cohort=60, n_periods=5, tau_constant=1.0, sigma=0.1 + ) + with pytest.warns(UserWarning, match="balance_e"): + res = TwoStageDiD().fit( + panel, + outcome="outcome", + unit="unit", + time="time", + first_treat="first_treat", + aggregate="event_study", + balance_e=10, # impossible pre-window + ) + assert res.event_study_effects is not None + # Only the reference period survives. + finite_effects = [ + h for h, eff in res.event_study_effects.items() if eff.get("n_obs", 0) > 0 + ] + assert finite_effects == [] or all( + res.event_study_effects[h]["effect"] == 0.0 for h in res.event_study_effects + ) + + def test_zero_obs_cohort_nan(self) -> None: + """A cohort whose treated obs ALL fall at an unidentified period -> its + group effect is NaN with n_obs=0 (REGISTRY edge case). Constructed: period + 4 is observed only among treated obs (never-treated absent at t=4), so its + time FE is unidentified in Ω₀ and every t=4 outcome has NaN y_tilde. Cohort + 4 is treated only at t=4 -> all its treated obs are NaN -> n_obs=0; cohort 3 + keeps its identified t=3 effect (n_obs > 0).""" + rng = np.random.default_rng(_BASE_SEED_IDENT + 4) + rows: List[Dict[str, Any]] = [] + uid = 0 + # Never-treated: observed t=1,2,3 only (absent at t=4 -> t=4 has no + # untreated obs anywhere -> its time FE is unidentified). + for _ in range(40): + c_i = rng.standard_normal() + for t in (1, 2, 3): + rows.append( + { + "unit": uid, + "time": t, + "first_treat": 0, + "outcome": c_i + 0.5 * t + 0.1 * rng.standard_normal(), + } + ) + uid += 1 + # Cohort 3 (treated t>=3) and cohort 4 (treated only at t=4), t=1..4. + for g in (3, 4): + for _ in range(40): + c_i = rng.standard_normal() + for t in (1, 2, 3, 4): + treated = t >= g + y = c_i + 0.5 * t + (1.0 if treated else 0.0) + 0.1 * rng.standard_normal() + rows.append({"unit": uid, "time": t, "first_treat": g, "outcome": y}) + uid += 1 + panel = pd.DataFrame(rows) + + with warnings.catch_warnings(): + warnings.simplefilter("ignore") # rank / NaN-y_tilde / lstsq warnings expected + res = TwoStageDiD(rank_deficient_action="silent").fit( + panel, + outcome="outcome", + unit="unit", + time="time", + first_treat="first_treat", + aggregate="group", + ) + assert res.group_effects is not None + # Cohort 4's only treated obs is the unidentified t=4 -> NaN, n_obs=0. + assert 4 in res.group_effects + c4 = res.group_effects[4] + assert c4["n_obs"] == 0, f"cohort 4 should have n_obs=0, got {c4['n_obs']}" + assert np.isnan(c4["effect"]), "cohort 4 effect must be NaN" + assert np.isnan(c4["se"]) and np.isnan(c4["t_stat"]) and np.isnan(c4["p_value"]) + # Cohort 3 keeps an identified t=3 effect. + assert 3 in res.group_effects and res.group_effects[3]["n_obs"] > 0 + + +# ============================================================================= +# Library extensions / deviations +# ============================================================================= + + +class TestGardner2022LibraryDeviations: + """Gardner (2022) prescribes analytical GMM SEs only (§3.3 + Appendix B). + The library adds a multiplier bootstrap and narrows vcov_type to the GMM + sandwich -- both documented in REGISTRY ## TwoStageDiD.""" + + def test_multiplier_bootstrap_is_library_extension(self, ci_params) -> None: + """The multiplier bootstrap on the GMM influence function runs and yields + a finite bootstrap SE; it is a library extension (the paper proposes no + bootstrap; R did2s defaults to analytical SEs).""" + rng = np.random.default_rng(_BASE_SEED_DEV + 1) + panel = _make_staggered_panel( + rng, cohorts=[3, 4], n_per_cohort=60, tau_constant=1.0, sigma=0.2 + ) + n_boot = ci_params.bootstrap(99) + res = TwoStageDiD(n_bootstrap=n_boot, seed=123).fit( + panel, outcome="outcome", unit="unit", time="time", first_treat="first_treat" + ) + assert res.bootstrap_results is not None + assert np.isfinite(res.bootstrap_results.overall_att_se) + assert res.bootstrap_results.overall_att_se > 0 + + def test_vcov_type_narrow_to_hc1(self) -> None: + """vcov_type is permanently narrow to the GMM sandwich: classical/hc2/ + hc2_bm are rejected at construction (the GMM-corrected meat has no single + hat matrix spanning both stages).""" + for bad in ("classical", "hc2", "hc2_bm"): + with pytest.raises(ValueError, match="rejected"): + TwoStageDiD(vcov_type=bad) + # conley is deferred (separate error), and an unknown type is invalid. + with pytest.raises(ValueError): + TwoStageDiD(vcov_type="conley") + with pytest.raises(ValueError): + TwoStageDiD(vcov_type="bogus") + + +# ============================================================================= +# R parity — did2s::did2s() (skip-guarded golden) +# ============================================================================= + + +@pytest.fixture(scope="module") +def golden() -> dict: + if not _R_FIXTURE_AVAILABLE: + pytest.skip( + "R did2s parity fixture not present. Run " + "`Rscript benchmarks/R/generate_did2s_golden.R` to regenerate " + "`benchmarks/data/did2s_golden.json`." + ) + with GOLDEN_PATH.open("r") as f: + return json.loads(f.read()) + + +@pytest.fixture(scope="module") +def panel() -> pd.DataFrame: + if not _R_FIXTURE_AVAILABLE: + pytest.skip("R did2s parity fixture not present.") + return pd.read_csv(PANEL_PATH) + + +class TestTwoStageDiDParityR: + """Pin Python TwoStageDiD output against R ``did2s::did2s()`` (analytical + corrected clustered SEs, bootstrap = FALSE, clustered at unit).""" + + def test_overall_att_matches_r(self, golden: dict, panel: pd.DataFrame) -> None: + res = TwoStageDiD().fit( + panel, outcome="y", unit="unit", time="time", first_treat="first_treat" + ) + assert res.overall_att == pytest.approx(golden["overall"]["att"], abs=1e-6) + + def test_overall_se_matches_r(self, golden: dict, panel: pd.DataFrame) -> None: + res = TwoStageDiD().fit( + panel, outcome="y", unit="unit", time="time", first_treat="first_treat" + ) + assert res.overall_se == pytest.approx(golden["overall"]["se"], abs=1e-7) + + def test_event_study_atts_match_r(self, golden: dict, panel: pd.DataFrame) -> None: + res = TwoStageDiD().fit( + panel, + outcome="y", + unit="unit", + time="time", + first_treat="first_treat", + aggregate="event_study", + ) + assert res.event_study_effects is not None + es = golden["event_study"] + assert len(es["horizons"]) > 0 + for h, att in zip(es["horizons"], es["att"]): + assert h in res.event_study_effects, f"missing horizon {h}" + got = res.event_study_effects[h]["effect"] + assert np.isfinite(got), f"non-finite ATT at h={h}" + assert got == pytest.approx(att, abs=1e-6), f"h={h}" + + def test_event_study_ses_match_r(self, golden: dict, panel: pd.DataFrame) -> None: + res = TwoStageDiD().fit( + panel, + outcome="y", + unit="unit", + time="time", + first_treat="first_treat", + aggregate="event_study", + ) + assert res.event_study_effects is not None + es = golden["event_study"] + assert len(es["horizons"]) > 0 + for h, se in zip(es["horizons"], es["se"]): + assert h in res.event_study_effects, f"missing horizon {h}" + got = res.event_study_effects[h]["se"] + assert np.isfinite(got), f"non-finite SE at h={h}" + assert got == pytest.approx(se, abs=1e-7), f"h={h}"