Add synthetic control tutorial + generate_synthetic_control_data generator by igerber · Pull Request #540 · igerber/diff-diff

igerber · 2026-06-22T17:56:59Z

Summary

Add generate_synthetic_control_data() — a public single-treated-unit factor-model data generator in diff_diff/prep_dgp.py (exported from diff_diff). One treated unit whose factor loadings + baseline are an exact convex combination of a few donors (so it lies inside the donor convex hull — a good synthetic control provably exists), persistent AR(1) factors, predictor covariates that each proxy a distinct factor, a common time effect, and a known "ramp"/"constant" effect emitted as true_effect.
Add capstone tutorial docs/tutorials/25_synthetic_control_policy.ipynb walking the full SyntheticControl surface end-to-end on a policy-evaluation story (one state adopts a clean-energy standard), structured around two inference philosophies: cross-unit permutation (in_space_placebo + Firpo–Possebom confidence_set, with leave_one_out/in_time_placebo robustness) vs over-time conformal (CWZ conformal_test/conformal_confidence_intervals/conformal_average_effect), with the per-period conformal band as the climax.
Tests + docs: TestGenerateSyntheticControlData unit tests, a test_t25_*_drift.py guard that re-derives every quoted number from the generator, prep.rst autofunction + example, index.rst toctree, doc-deps.yaml tutorial entries, CHANGELOG, and the llms-full generator catalog.
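
As a rough illustration of the construction described above, the following stdlib-only sketch builds a toy factor-model panel in which the treated unit's loadings and baseline are a convex combination of three donors. It is not the `generate_synthetic_control_data` implementation: the function name `toy_scm_panel`, its parameters, the two-factor setup, the linear time effect, and the particular ramp shape are all assumptions made for this example, and no noise is added.

```python
import random


def toy_scm_panel(n_donors=6, n_periods=12, first_post=8, effect=2.0, rho=0.8, seed=0):
    """Hypothetical toy DGP: one treated unit inside the donor hull (noiseless)."""
    if n_donors < 3:
        raise ValueError("this sketch puts weight on the first three donors")
    rng = random.Random(seed)
    n_factors = 2

    # Persistent AR(1) factors shared by all units.
    factors = []
    for _ in range(n_factors):
        path, value = [], rng.gauss(0, 1)
        for _ in range(n_periods):
            value = rho * value + rng.gauss(0, 1)
            path.append(value)
        factors.append(path)

    time_effect = [0.1 * t for t in range(n_periods)]  # common time effect (assumed linear)
    loadings = [[rng.uniform(0, 2) for _ in range(n_factors)] for _ in range(n_donors)]
    baselines = [rng.uniform(5, 15) for _ in range(n_donors)]

    # Convex weights: non-negative, sum to one, supported on a few donors.
    weights = [0.5, 0.3, 0.2] + [0.0] * (n_donors - 3)
    treated_loading = [sum(w * lam[k] for w, lam in zip(weights, loadings))
                       for k in range(n_factors)]
    treated_baseline = sum(w * b for w, b in zip(weights, baselines))

    def signal(baseline, loading):
        return [baseline + time_effect[t]
                + sum(loading[k] * factors[k][t] for k in range(n_factors))
                for t in range(n_periods)]

    donors = [signal(b, lam) for b, lam in zip(baselines, loadings)]
    counterfactual = signal(treated_baseline, treated_loading)

    # A "ramp" effect that grows linearly after adoption (shape assumed here).
    true_effect = [effect * (t - first_post + 1) if t >= first_post else 0.0
                   for t in range(n_periods)]
    treated = [y0 + tau for y0, tau in zip(counterfactual, true_effect)]
    return {"weights": weights, "donors": donors, "treated": treated,
            "counterfactual": counterfactual, "true_effect": true_effect}
```

Because the weights sum to one, the common time effect passes through the convex combination unchanged, so the noiseless treated counterfactual equals the weighted donor average in every period.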

Methodology references

Method name(s): tutorial showcases the existing SyntheticControl estimator + its inference layers; the new code is a synthetic data generator (a factor-model DGP for demos/tests), not a new estimator or a change to any estimator's math.
Paper / source link(s): Abadie, Diamond & Hainmueller (2010, 2015); Firpo & Possebom (2018); Chernozhukov, Wüthrich & Zhu (2021). All already in docs/references.rst.
Any intentional deviations from the source (and why): None. No estimator math changed; no REGISTRY methodology change.

Validation

Tests added/updated: tests/test_prep.py (TestGenerateSyntheticControlData, 8 tests), tests/test_t25_synthetic_control_policy_drift.py (14 tests re-deriving every notebook number).
Backtest / simulation / notebook evidence: notebook executes under DIFF_DIFF_BACKEND=python pytest --nbmake in ~58s (well under the 600s CI budget) with 6 figures; recovers the injected effect (ATT 6.79 vs true mean 7.0), placebo p=0.048, CWZ pointwise band brackets the truth 5/5, average-effect CI [6.5, 7.5].
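
For readers unfamiliar with how an in-space placebo p-value such as the one quoted above is typically formed, here is a minimal sketch of the rank-based calculation in the style of Abadie, Diamond & Hainmueller: each unit's post/pre RMSPE ratio is computed from its gap series, and the p-value is the share of units whose ratio is at least as large as the treated unit's. The function names and the toy gap series are hypothetical, and the library's `in_space_placebo` may differ in details (for example, how poorly fitting placebos are filtered).

```python
import math


def rmspe(gaps):
    """Root mean squared prediction error of a gap series."""
    return math.sqrt(sum(g * g for g in gaps) / len(gaps))


def placebo_p_value(gaps_by_unit, treated, first_post):
    """Share of units (treated included) with a post/pre RMSPE ratio >= the treated one.

    gaps_by_unit maps unit name -> list of (observed - synthetic) gaps, where each
    donor's gaps come from treating that donor as if it had been treated.
    Assumes every unit has a non-zero pre-period RMSPE.
    """
    ratios = {
        unit: rmspe(gaps[first_post:]) / rmspe(gaps[:first_post])
        for unit, gaps in gaps_by_unit.items()
    }
    n_extreme = sum(r >= ratios[treated] for r in ratios.values())
    return n_extreme / len(ratios)


# Toy gap series (made up for illustration): 4 pre-periods, 2 post-periods.
toy_gaps = {
    "treated": [0.1, -0.1, 0.1, -0.1, 3.0, 3.0],
    "donor_a": [0.2, -0.2, 0.2, -0.2, 0.2, -0.2],
    "donor_b": [0.3, -0.3, 0.3, -0.3, 0.4, -0.4],
    "donor_c": [0.1, -0.1, 0.1, -0.1, 0.2, -0.2],
}
p_value = placebo_p_value(toy_gaps, treated="treated", first_post=4)
```

With J donors the smallest attainable p-value under this convention is 1/(J+1), which is why the size of the donor pool bounds how small a placebo p-value can be.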

Security / privacy

Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

github-actions · 2026-06-22T17:59:29Z

Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

No estimator math, weighting, variance/SE, identification checks, or default estimator behavior appears changed.
The tutorial’s SCM, Firpo–Possebom, and CWZ conformal descriptions align with docs/methodology/REGISTRY.md:L1997-L2081.
Analytical SCM inference remains NaN as required by the registry and implementation contract.
One P2 documentation/methodology overstatement: the generator’s “provably exists / in convex hull” wording is stronger than what the noisy observed data actually guarantee.
I could not run tests locally because the environment is missing numpy.

Methodology

Finding: Observed convex-hull guarantee is overstated
Severity: P2
Impact: generate_synthetic_control_data() makes the treated unit’s latent loadings and baseline an exact convex combination of donors, but then adds independent predictor noise and independent outcome noise. ADH’s convex-hull fit condition concerns observed pre-period outcomes and predictors, not only latent loadings. The docs/tutorial wording says a good synthetic control “provably exists,” which is too strong under default nonzero predictor_noise_sd and noise_sd. See diff_diff/prep_dgp.py:L491-L507, diff_diff/prep_dgp.py:L641-L664, and the ADH registry/source condition in docs/methodology/papers/abadie-diamond-hainmueller-2010-review.md:L55-L69.
Concrete fix: Rephrase as “the noiseless latent counterfactual signal is in the donor convex hull; observed fit is approximate because transitory/predictor noise is added,” or make treated predictor/no-intervention shocks the same donor convex combination when exact observed in-hull behavior is intended.
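
The distinction drawn in this finding can be checked numerically. The sketch below is a hypothetical stand-in, not the library's generator: it builds a treated signal as a convex combination of donor signals, optionally adds independent noise to every observed series, and evaluates the pre-period RMSPE under the generating weights. The names `pre_rmspe` and `observed_fit` and the weights `[0.6, 0.4, 0.0]` are assumptions for illustration; only the parameter name `noise_sd` mirrors the one mentioned in the review.

```python
import math
import random


def pre_rmspe(treated, donors, weights):
    """RMSPE of the treated path against a weighted donor average."""
    gaps = [
        y - sum(w * d[t] for w, d in zip(weights, donors))
        for t, y in enumerate(treated)
    ]
    return math.sqrt(sum(g * g for g in gaps) / len(gaps))


def observed_fit(noise_sd, n_periods=20, seed=1):
    """Fit under the generating weights once independent noise is added."""
    rng = random.Random(seed)
    weights = [0.6, 0.4, 0.0]  # convex: non-negative and summing to one
    signals = [[rng.gauss(10, 2) for _ in range(n_periods)] for _ in weights]
    treated_signal = [
        sum(w * s[t] for w, s in zip(weights, signals)) for t in range(n_periods)
    ]

    def add_noise(path):
        # Independent noise per unit and period; noise_sd=0 leaves the path unchanged.
        return [y + rng.gauss(0, noise_sd) for y in path]

    return pre_rmspe(add_noise(treated_signal), [add_noise(s) for s in signals], weights)


noiseless_fit = observed_fit(noise_sd=0.0)  # generating weights reproduce the path
noisy_fit = observed_fit(noise_sd=0.5)      # the same weights now leave a residual gap
```

This only shows that the generating weights stop fitting exactly once noise is added; it does not by itself rule out some other weight vector fitting the noisy data well, which is why "approximate" is the safer wording for the observed fit.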

Code Quality

No findings. The new generator is scoped, exported consistently through prep.py and __init__.py, and validates the key shape/timing arguments.

Performance

No findings. The new drift test is relatively heavy, but it uses a module fixture and appears intentionally scoped to the tutorial’s quoted numerics.

Maintainability

No findings. The docs dependency map adds the tutorial to the affected SyntheticControl/conformal surfaces, which helps future drift tracking.

Tech Debt

No findings requiring TODO tracking. The only issue above is a documentation/contract precision issue, not silent statistical output.

Security

No findings. I did not see secrets or sensitive material introduced in the changed surfaces.

Documentation/Tests

Finding: Tests do not pin the exact-hull wording
Severity: P3
Impact: test_treated_in_hull_recovers_effect only checks loose recovery on noisy data, so it would not catch the documentation overclaim above. See tests/test_prep.py:L1133-L1163.
Concrete fix: If keeping exact-hull language, add a noiseless test (noise_sd=0, predictor_noise_sd=0) that demonstrates near-zero pre-RMSPE, or update the wording and test name to say “latent/noiseless signal is in hull” rather than observed data.

igerber · 2026-06-23T12:40:09Z

/ai-review

github-actions · 2026-06-23T12:42:50Z

🔁 AI review rerun (requested by @igerber)

Head SHA: fe44783a8816a3347e657bbd6ec83fb9b578977c

Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

No estimator math, weighting, variance/SE, identification assumptions, or default estimator behavior changed.
The previous convex-hull overstatement is materially resolved in the docstring, changelog, API docs, tutorial prose, and tests.
Tutorial methodology aligns with docs/methodology/REGISTRY.md:L1997-L2081 for ADH SCM, Firpo–Possebom test inversion, and CWZ conformal inference.
Analytical SCM inference remains NaN and is explicitly drift-tested.
I could not run pytest: this environment lacks both pytest and numpy.

Methodology

No P0/P1 findings.

Finding: Residual shorthand in LLM guide omits the noiseless qualifier
Severity: P3
Impact: diff_diff/guides/llms-full.txt:L1990-L1991 says “treated unit in the donor convex hull” while the generator only guarantees the noiseless latent/outcome path is in the donor hull under the generating convex weights; observed noisy data are approximate. The main public docs correctly state this at diff_diff/prep_dgp.py:L503-L509, docs/api/prep.rst:L134-L138, and notebook prose lines around docs/tutorials/25_synthetic_control_policy.ipynb:L42-L46.
Concrete fix: Optionally change the LLM-guide comment to “noiseless treated path in the donor convex hull.”

Code Quality

No findings. The generator is scoped and validates key dimensional/timing arguments at diff_diff/prep_dgp.py:L593-L608. Exports are wired through diff_diff/prep.py:L20-L31 and diff_diff/__init__.py:L126-L136,L418-L430.

Performance

No findings. The tutorial drift test is intentionally heavier but uses a module-scoped fixture to avoid repeated full SCM/conformal runs: tests/test_t25_synthetic_control_policy_drift.py:L45-L94.

Maintainability

No findings. The drift test pins notebook kwargs and quoted values, reducing prose/numeric drift risk: tests/test_t25_synthetic_control_policy_drift.py:L210-L244.

Tech Debt

No findings requiring TODO tracking. Existing SCM deferred work remains in TODO.md and is unrelated to this PR’s generator/tutorial changes.

Security

No findings. I did not see secrets or sensitive material introduced in the changed surfaces.

Documentation/Tests

No blocking findings. The prior test gap is addressed by test_noiseless_outcome_path_in_hull at tests/test_prep.py:L1167-L1195, and analytical NaN inference is checked at tests/test_t25_synthetic_control_policy_drift.py:L144-L151.

Verification limitation: git diff --check passed, but targeted pytest execution could not run because pytest is not installed; a lightweight import smoke check also failed because numpy is unavailable.

Add a public single-treated-unit factor-model data generator (generate_synthetic_control_data in prep_dgp.py, exported from diff_diff) and a capstone SyntheticControl tutorial (docs/tutorials/25_synthetic_control_policy.ipynb) showcasing the full estimator surface and the two inference philosophies (cross-unit permutation vs CWZ over-time conformal), with the per-period conformal band as the climax. The generator builds a treated unit whose latent loadings/baseline are an exact convex combination of donors, so its NOISELESS trajectory lies in the donor convex hull (the observed fit is approximate under added transitory/predictor noise). Includes TestGenerateSyntheticControlData + test_noiseless_outcome_path_in_hull unit tests, a t25 drift guard re-deriving every quoted number from the generator, and doc surfaces (api/prep.rst, index toctree, doc-deps.yaml, CHANGELOG, llms-full).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

igerber · 2026-06-23T12:54:00Z

/ai-review

github-actions · 2026-06-23T13:01:23Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 2477a6b8b2b4f6cc5a6203bae5e63c70a0eb343d

Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

No estimator math, weighting, variance/SE, identification assumptions, or estimator defaults changed; the PR adds an SCM DGP, tutorial, docs, exports, and tests.
SyntheticControl tutorial claims align with docs/methodology/REGISTRY.md:L2004-L2088, including always-NaN analytical inference, in-space placebo, Firpo–Possebom sets, ADH robustness checks, and CWZ conformal inference.
The prior convex-hull wording issue is resolved in the LLM guide: diff_diff/guides/llms-full.txt:L1990-L1993.
Generator tests now cover structure, invalid args, noisy recovery, and the noiseless in-hull property: tests/test_prep.py:L1043-L1216.
Runtime tests could not be run here: pytest, numpy, and pandas are not installed. git diff --check passed, and the notebook JSON parsed.

Methodology

Finding: None
Severity: None
Impact: The affected method surface is existing SyntheticControl; this PR does not change estimator behavior. The tutorial’s no-analytical-SE statement matches diff_diff/synthetic_control.py:L22-L31 and the implementation’s safe_inference(att, np.nan, ...) path at diff_diff/synthetic_control.py:L611-L648.
Concrete fix: None required.

Finding: None
Severity: None
Impact: The CWZ tutorial prose matches the documented conformal contract: outcomes-only constrained-LS proxy over all periods under the null, no ADH V matrix, residual permutation over time. See diff_diff/conformal.py:L11-L28 and docs/tutorials/25_synthetic_control_policy.ipynb:L786-L796.
Concrete fix: None required.
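
To illustrate the "residual permutation over time" step mentioned in the finding above, here is a simplified sketch of a conformal p-value computed from a residual series using moving-block (cyclic-shift) permutations, in the spirit of Chernozhukov, Wüthrich & Zhu. It assumes the residuals have already been obtained by fitting a proxy model on all periods under the null; that fitting step is omitted. The function name `conformal_p_value` and the toy residuals are hypothetical, and the library's `conformal_test` may use a different statistic or permutation scheme.

```python
import math


def conformal_p_value(residuals, n_post, q=1):
    """Permutation p-value from residuals whose last n_post entries are post-treatment.

    The statistic is S_q(u) = (n_post**-0.5 * sum(|u_t|**q over post periods))**(1/q).
    Permutations are the T cyclic shifts of the series, identity included, so the
    p-value is never smaller than 1/T.
    """
    def statistic(u):
        post = u[-n_post:]
        return (sum(abs(x) ** q for x in post) / math.sqrt(n_post)) ** (1 / q)

    n_periods = len(residuals)
    observed = statistic(residuals)
    shifted = (
        [residuals[(t + shift) % n_periods] for t in range(n_periods)]
        for shift in range(n_periods)
    )
    n_extreme = sum(statistic(u) >= observed for u in shifted)
    return n_extreme / n_periods


# Toy residuals (made up): four quiet pre-periods, one large post-period residual.
toy_residuals = [0.1, -0.2, 0.1, -0.1, 5.0]
p_value = conformal_p_value(toy_residuals, n_post=1)
```

A confidence interval for a period's effect can then be formed by test inversion: repeat the calculation over a grid of hypothesized effects and keep those whose p-value exceeds the chosen level.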

Code Quality

Finding: None
Severity: None
Impact: generate_synthetic_control_data() validates core dimensions/timing arguments and implements the intended single-treated-unit donor-panel DGP cleanly: diff_diff/prep_dgp.py:L469-L683. Exports are wired through diff_diff/prep.py:L18-L31 and diff_diff/__init__.py:L124-L136,L418-L430.
Concrete fix: None required.

Performance

Finding: None
Severity: None
Impact: The heavier tutorial drift checks use a module-scoped fixture so the full SCM/conformal surface is run once: tests/test_t25_synthetic_control_policy_drift.py:L45-L94.
Concrete fix: None required.

Maintainability

Finding: None
Severity: None
Impact: The tutorial is connected to doc drift tracking for SyntheticControl, conformal, and prep DGP surfaces: docs/doc-deps.yaml:L640-L675 and docs/doc-deps.yaml:L917-L927.
Concrete fix: None required.

Tech Debt

Finding: None
Severity: None
Impact: Existing SyntheticControl deferred work remains tracked in TODO.md:L89-L90 and is not expanded by this PR.
Concrete fix: None required.

Security

Finding: None
Severity: None
Impact: No secrets or sensitive material were found in the changed surfaces.
Concrete fix: None required.

Documentation/Tests

Finding: None
Severity: None
Impact: Public API docs and changelog describe the new generator with the noiseless/observed-noise distinction: docs/api/prep.rst:L131-L160, CHANGELOG.md:L8-L14. The drift test pins quoted tutorial numbers, analytical NaN inference, conformal p-values/CIs, and notebook kwargs: tests/test_t25_synthetic_control_policy_drift.py:L144-L151,L185-L207,L210-L244.
Concrete fix: None required.

github-actions · 2026-06-23T13:10:44Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 2477a6b8b2b4f6cc5a6203bae5e63c70a0eb343d

Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

This PR adds an SCM data generator, tutorial, docs, exports, and drift tests; it does not change estimator math, weighting, SE/variance, identification assumptions, or defaults.
Tutorial methodology matches the registry for SyntheticControl: analytical se/t_stat/p_value/conf_int stay NaN, with placebo/Firpo/CWZ inference surfaced separately (docs/methodology/REGISTRY.md:L2004-L2088).
The generator’s noiseless in-hull claim is now scoped to the counterfactual/noisy distinction and is tested (diff_diff/prep_dgp.py:L503-L512, tests/test_prep.py:L1167-L1195).
No new inline inference anti-patterns or partial NaN guards were found in the changed files.
Static checks passed: git diff --check and notebook JSON parsing. Runtime tests could not be run because pytest and numpy are not installed.

Methodology

Finding: None
Severity: None
Impact: The affected method surface is existing SyntheticControl; the PR only adds a DGP/tutorial. The tutorial’s in-space placebo, Firpo–Possebom confidence set, ADH robustness, and CWZ conformal descriptions align with the registry and implementation contracts (diff_diff/synthetic_control.py:L22-L31, diff_diff/synthetic_control.py:L611-L648, diff_diff/conformal.py:L21-L28).
Concrete fix: None required.

Code Quality

Finding: None
Severity: None
Impact: generate_synthetic_control_data() validates core dimensions/timing, constructs one treated unit plus donors, and cleanly emits treatment, ever-treated, true-effect, and predictor columns (diff_diff/prep_dgp.py:L593-L683). Public exports are wired through prep and package __all__ (diff_diff/prep.py:L18-L31, diff_diff/__init__.py:L120-L136, diff_diff/__init__.py:L419-L430).
Concrete fix: None required.

Performance

Finding: None
Severity: None
Impact: The heavier tutorial drift checks use a module-scoped fixture so the SCM/conformal workflow runs once (tests/test_t25_synthetic_control_policy_drift.py:L45-L94).
Concrete fix: None required.

Maintainability

Finding: None
Severity: None
Impact: Documentation drift tracking links the new tutorial to SyntheticControl, SyntheticControlResults, conformal.py, and prep_dgp.py (docs/doc-deps.yaml:L646-L675, docs/doc-deps.yaml:L917-L927).
Concrete fix: None required.

Tech Debt

Finding: None
Severity: None
Impact: Existing deferred SyntheticControl items remain tracked in TODO.md:L89-L90; this PR does not introduce untracked correctness debt.
Concrete fix: None required.

Security

Finding: None
Severity: None
Impact: No secrets, credentials, or sensitive data were identified in the changed code/docs/tutorial surfaces.
Concrete fix: None required.

Documentation/Tests

Finding: None
Severity: None
Impact: API docs, tutorial toctree, LLM guide, and drift tests are updated for the new generator/tutorial (docs/api/prep.rst:L131-L161, docs/index.rst:L100-L112, diff_diff/guides/llms-full.txt:L1990-L1993, tests/test_t25_synthetic_control_policy_drift.py:L97-L244). Static validation passed; runtime tests were not executable here due to missing pytest/numpy.
Concrete fix: None required.

igerber force-pushed the docs/scm-tutorial branch from 3b18921 to 2477a6b on June 23, 2026 12:53

igerber added the ready-for-ci label (Triggers CI test workflows) on Jun 23, 2026

igerber merged commit 531db01 into main Jun 23, 2026
33 of 35 checks passed

igerber deleted the docs/scm-tutorial branch June 23, 2026 14:45
