Add synthetic control tutorial + generate_synthetic_control_data generator #540

Merged: igerber merged 1 commit into main from docs/scm-tutorial on Jun 23, 2026

@igerber (Owner) commented Jun 22, 2026

Summary

  • Add generate_synthetic_control_data() — a public single-treated-unit factor-model data generator in diff_diff/prep_dgp.py (exported from diff_diff). One treated unit whose factor loadings + baseline are an exact convex combination of a few donors (so it lies inside the donor convex hull — a good synthetic control provably exists), persistent AR(1) factors, predictor covariates that each proxy a distinct factor, a common time effect, and a known "ramp"/"constant" effect emitted as true_effect.
  • Add capstone tutorial docs/tutorials/25_synthetic_control_policy.ipynb walking the full SyntheticControl surface end-to-end on a policy-evaluation story (one state adopts a clean-energy standard), structured around two inference philosophies: cross-unit permutation (in_space_placebo + Firpo–Possebom confidence_set, with leave_one_out/in_time_placebo robustness) vs over-time conformal (CWZ conformal_test/conformal_confidence_intervals/conformal_average_effect), with the per-period conformal band as the climax.
  • Tests + docs: TestGenerateSyntheticControlData unit tests, a test_t25_*_drift.py guard that re-derives every quoted number from the generator, prep.rst autofunction + example, index.rst toctree, doc-deps.yaml tutorial entries, CHANGELOG, and the llms-full generator catalog.
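The generator's structure, as described in the first bullet, can be sketched in plain Python. This is a minimal illustration and not the code in diff_diff/prep_dgp.py: the function name `sketch_scm_panel`, every parameter name and default, and the linear shape of the ramp are assumptions made for the example, and the predictor covariates and common time effect are omitted.

```python
import random


def sketch_scm_panel(n_donors=10, n_pre=15, n_post=5, n_factors=2, rho=0.8,
                     effect=7.0, effect_type="constant", noise_sd=0.5, seed=0):
    """Toy single-treated-unit factor-model panel (hypothetical, illustrative only)."""
    if n_donors < 3:
        raise ValueError("this sketch mixes three donors, so n_donors must be >= 3")
    rng = random.Random(seed)
    n_periods = n_pre + n_post

    # Persistent AR(1) factors: f_t = rho * f_{t-1} + shock_t.
    factors = []
    for _ in range(n_factors):
        level, path = 0.0, []
        for _ in range(n_periods):
            level = rho * level + rng.gauss(0.0, 1.0)
            path.append(level)
        factors.append(path)

    # Donor factor loadings and baselines.
    loadings = [[rng.uniform(0.5, 1.5) for _ in range(n_factors)]
                for _ in range(n_donors)]
    baselines = [rng.uniform(10.0, 20.0) for _ in range(n_donors)]

    # Convex weights (non-negative, summing to one) on the first three donors.
    raw = [rng.random() + 0.1 for _ in range(3)]
    total = sum(raw)
    weights = [r / total for r in raw] + [0.0] * (n_donors - 3)

    # Treated loadings and baseline are the same convex combination of donors.
    treated_loading = [sum(w * lam[k] for w, lam in zip(weights, loadings))
                       for k in range(n_factors)]
    treated_baseline = sum(w * b for w, b in zip(weights, baselines))

    # Known effect: zero before treatment, then constant or a linear ramp (assumed shape).
    if effect_type == "constant":
        post = [effect] * n_post
    elif effect_type == "ramp":
        post = [effect * (k + 1) / n_post for k in range(n_post)]
    else:
        raise ValueError("effect_type must be 'constant' or 'ramp'")
    true_effect = [0.0] * n_pre + post

    def outcome(baseline, lam, bump):
        # baseline + loadings * factors + effect + transitory noise
        return [baseline
                + sum(l * f[t] for l, f in zip(lam, factors))
                + bump[t]
                + rng.gauss(0.0, noise_sd)
                for t in range(n_periods)]

    zeros = [0.0] * n_periods
    donors = [outcome(baselines[j], loadings[j], zeros) for j in range(n_donors)]
    treated = outcome(treated_baseline, treated_loading, true_effect)
    return {"treated": treated, "donors": donors,
            "weights": weights, "true_effect": true_effect}
```

With `noise_sd=0.0` the gap between the treated path and the weighted donor combination equals `true_effect` in every period, which is the property the convex-combination construction is meant to deliver.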

Methodology references

  • Method name(s): tutorial showcases the existing SyntheticControl estimator + its inference layers; the new code is a synthetic data generator (a factor-model DGP for demos/tests), not a new estimator or a change to any estimator's math.
  • Paper / source link(s): Abadie, Diamond & Hainmueller (2010, 2015); Firpo & Possebom (2018); Chernozhukov, Wüthrich & Zhu (2021). All already in docs/references.rst.
  • Any intentional deviations from the source (and why): None. No estimator math changed; no REGISTRY methodology change.

Validation

  • Tests added/updated: tests/test_prep.py (TestGenerateSyntheticControlData, 8 tests), tests/test_t25_synthetic_control_policy_drift.py (14 tests re-deriving every notebook number).
  • Backtest / simulation / notebook evidence: notebook executes under DIFF_DIFF_BACKEND=python pytest --nbmake in ~58s (well under the 600s CI budget) with 6 figures; recovers the injected effect (ATT 6.79 vs true mean 7.0), placebo p=0.048, CWZ pointwise band brackets the truth 5/5, average-effect CI [6.5, 7.5].

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

@github-actions

Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

  • No estimator math, weighting, variance/SE, identification checks, or default estimator behavior appears changed.
  • The tutorial’s SCM, Firpo–Possebom, and CWZ conformal descriptions align with docs/methodology/REGISTRY.md:L1997-L2081.
  • Analytical SCM inference remains NaN as required by the registry and implementation contract.
  • One P2 documentation/methodology overstatement: the generator’s “provably exists / in convex hull” wording is stronger than what the noisy observed data actually guarantee.
  • I could not run tests locally because the environment is missing numpy.

Methodology

Finding: Observed convex-hull guarantee is overstated
Severity: P2
Impact: generate_synthetic_control_data() makes the treated unit’s latent loadings and baseline an exact convex combination of donors, but then adds independent predictor noise and independent outcome noise. ADH’s convex-hull fit condition concerns observed pre-period outcomes and predictors, not only latent loadings. The docs/tutorial wording says a good synthetic control “provably exists,” which is too strong under default nonzero predictor_noise_sd and noise_sd. See diff_diff/prep_dgp.py:L491-L507, diff_diff/prep_dgp.py:L641-L664, and the ADH registry/source condition in docs/methodology/papers/abadie-diamond-hainmueller-2010-review.md:L55-L69.
Concrete fix: Rephrase as “the noiseless latent counterfactual signal is in the donor convex hull; observed fit is approximate because transitory/predictor noise is added,” or make treated predictor/no-intervention shocks the same donor convex combination when exact observed in-hull behavior is intended.
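The distinction drawn in this finding can be illustrated with a toy example. The sketch below uses hand-written donor paths and hypothetical helper names (`pre_rmspe`, `add_noise`); it evaluates the fit only at the generating weights and does not run any SCM optimizer.

```python
import math
import random


def pre_rmspe(treated, donors, weights):
    """Root mean squared gap between a treated path and a weighted donor combination."""
    synth = [sum(w * d[t] for w, d in zip(weights, donors))
             for t in range(len(treated))]
    return math.sqrt(sum((y - s) ** 2 for y, s in zip(treated, synth)) / len(treated))


def add_noise(path, sd, rng):
    """Add independent Gaussian noise to every period of a path."""
    return [y + rng.gauss(0.0, sd) for y in path]


# Toy latent (noiseless) pre-period donor paths and generating convex weights.
donor_signal = [[10.0, 11.0, 12.0, 13.0],
                [20.0, 19.0, 18.0, 17.0],
                [15.0, 15.0, 16.0, 16.0]]
weights = [0.5, 0.3, 0.2]
treated_signal = [sum(w * d[t] for w, d in zip(weights, donor_signal))
                  for t in range(4)]

rng = random.Random(0)
noiseless = pre_rmspe(treated_signal, donor_signal, weights)
noisy = pre_rmspe(add_noise(treated_signal, 0.5, rng),
                  [add_noise(d, 0.5, rng) for d in donor_signal],
                  weights)
print(f"noiseless pre-RMSPE: {noiseless:.3g}, noisy pre-RMSPE: {noisy:.3g}")
```

The noiseless fit is exact by construction, while independent noise on the treated and donor paths leaves a nonzero gap at the generating weights. An optimizer may find other weights that fit the noisy data somewhat better, so this value is an upper bound on the best achievable pre-period fit rather than the fit itself.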

Code Quality

No findings. The new generator is scoped, exported consistently through prep.py and __init__.py, and validates the key shape/timing arguments.

Performance

No findings. The new drift test is relatively heavy, but it uses a module fixture and appears intentionally scoped to the tutorial’s quoted numerics.

Maintainability

No findings. The docs dependency map adds the tutorial to the affected SyntheticControl/conformal surfaces, which helps future drift tracking.

Tech Debt

No findings requiring TODO tracking. The only issue above is a documentation/contract precision issue, not silent statistical output.

Security

No findings. I did not see secrets or sensitive material introduced in the changed surfaces.

Documentation/Tests

Finding: Tests do not pin the exact-hull wording
Severity: P3
Impact: test_treated_in_hull_recovers_effect only checks loose recovery on noisy data, so it would not catch the documentation overclaim above. See tests/test_prep.py:L1133-L1163.
Concrete fix: If keeping exact-hull language, add a noiseless test (noise_sd=0, predictor_noise_sd=0) that demonstrates near-zero pre-RMSPE, or update the wording and test name to say “latent/noiseless signal is in hull” rather than observed data.

@igerber (Owner, Author) commented Jun 23, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: fe44783a8816a3347e657bbd6ec83fb9b578977c


Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

  • No estimator math, weighting, variance/SE, identification assumptions, or default estimator behavior changed.
  • The previous convex-hull overstatement is materially resolved in the docstring, changelog, API docs, tutorial prose, and tests.
  • Tutorial methodology aligns with docs/methodology/REGISTRY.md:L1997-L2081 for ADH SCM, Firpo–Possebom test inversion, and CWZ conformal inference.
  • Analytical SCM inference remains NaN and is explicitly drift-tested.
  • I could not run pytest: this environment lacks both pytest and numpy.

Methodology

No P0/P1 findings.

Finding: Residual shorthand in LLM guide omits the noiseless qualifier
Severity: P3
Impact: diff_diff/guides/llms-full.txt:L1990-L1991 says “treated unit in the donor convex hull” while the generator only guarantees the noiseless latent/outcome path is in the donor hull under the generating convex weights; observed noisy data are approximate. The main public docs correctly state this at diff_diff/prep_dgp.py:L503-L509, docs/api/prep.rst:L134-L138, and notebook prose lines around docs/tutorials/25_synthetic_control_policy.ipynb:L42-L46.
Concrete fix: Optionally change the LLM-guide comment to “noiseless treated path in the donor convex hull.”

Code Quality

No findings. The generator is scoped and validates key dimensional/timing arguments at diff_diff/prep_dgp.py:L593-L608. Exports are wired through diff_diff/prep.py:L20-L31 and diff_diff/__init__.py:L126-L136,L418-L430.

Performance

No findings. The tutorial drift test is intentionally heavier but uses a module-scoped fixture to avoid repeated full SCM/conformal runs: tests/test_t25_synthetic_control_policy_drift.py:L45-L94.

Maintainability

No findings. The drift test pins notebook kwargs and quoted values, reducing prose/numeric drift risk: tests/test_t25_synthetic_control_policy_drift.py:L210-L244.

Tech Debt

No findings requiring TODO tracking. Existing SCM deferred work remains in TODO.md and is unrelated to this PR’s generator/tutorial changes.

Security

No findings. I did not see secrets or sensitive material introduced in the changed surfaces.

Documentation/Tests

No blocking findings. The prior test gap is addressed by test_noiseless_outcome_path_in_hull at tests/test_prep.py:L1167-L1195, and analytical NaN inference is checked at tests/test_t25_synthetic_control_policy_drift.py:L144-L151.

Verification limitation: git diff --check passed, but targeted pytest execution could not run because pytest is not installed; a lightweight import smoke check also failed because numpy is unavailable.

Add a public single-treated-unit factor-model data generator
(generate_synthetic_control_data in prep_dgp.py, exported from diff_diff)
and a capstone SyntheticControl tutorial
(docs/tutorials/25_synthetic_control_policy.ipynb) showcasing the full
estimator surface and the two inference philosophies (cross-unit
permutation vs CWZ over-time conformal), with the per-period conformal
band as the climax.

The generator builds a treated unit whose latent loadings/baseline are an
exact convex combination of donors, so its NOISELESS trajectory lies in the
donor convex hull (the observed fit is approximate under added transitory/
predictor noise). Includes TestGenerateSyntheticControlData +
test_noiseless_outcome_path_in_hull unit tests, a t25 drift guard
re-deriving every quoted number from the generator, and doc surfaces
(api/prep.rst, index toctree, doc-deps.yaml, CHANGELOG, llms-full).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@igerber force-pushed the docs/scm-tutorial branch from 3b18921 to 2477a6b on June 23, 2026 12:53

@igerber (Owner, Author) commented Jun 23, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 2477a6b8b2b4f6cc5a6203bae5e63c70a0eb343d


Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

  • No estimator math, weighting, variance/SE, identification assumptions, or estimator defaults changed; the PR adds an SCM DGP, tutorial, docs, exports, and tests.
  • SyntheticControl tutorial claims align with docs/methodology/REGISTRY.md:L2004-L2088, including always-NaN analytical inference, in-space placebo, Firpo–Possebom sets, ADH robustness checks, and CWZ conformal inference.
  • The prior convex-hull wording issue is resolved in the LLM guide: diff_diff/guides/llms-full.txt:L1990-L1993.
  • Generator tests now cover structure, invalid args, noisy recovery, and the noiseless in-hull property: tests/test_prep.py:L1043-L1216.
  • Runtime tests could not be run here: pytest, numpy, and pandas are not installed. git diff --check passed, and the notebook JSON parsed.

Methodology

Finding: None
Severity: None
Impact: The affected method surface is existing SyntheticControl; this PR does not change estimator behavior. The tutorial’s no-analytical-SE statement matches diff_diff/synthetic_control.py:L22-L31 and the implementation’s safe_inference(att, np.nan, ...) path at diff_diff/synthetic_control.py:L611-L648.
Concrete fix: None required.

Finding: None
Severity: None
Impact: The CWZ tutorial prose matches the documented conformal contract: outcomes-only constrained-LS proxy over all periods under the null, no ADH V matrix, residual permutation over time. See diff_diff/conformal.py:L11-L28 and docs/tutorials/25_synthetic_control_policy.ipynb:L786-L796.
Concrete fix: None required.
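The "residual permutation over time" step can be sketched as follows. This is an illustration under stated assumptions, not the code in diff_diff/conformal.py: the function name `conformal_p_value` is hypothetical, the residuals are taken as given (the constrained-LS proxy fit under the null is omitted), the permutation set is assumed to be the cyclic (moving-block) shifts, and the statistic is assumed to be a q-norm of the post-period residuals.

```python
def conformal_p_value(residuals, n_post, q=1):
    """Permutation p-value from residuals fitted over all periods under the null.

    residuals : list of floats, pre-period entries first, post-period entries last
    n_post    : number of post-treatment periods at the end of the list
    """
    n_total = len(residuals)
    if not 0 < n_post < n_total:
        raise ValueError("n_post must be between 1 and len(residuals) - 1")

    def stat(u):
        # q-norm style summary of the residuals occupying the post-period block.
        block = u[n_total - n_post:]
        return (sum(abs(x) ** q for x in block) / n_post) ** (1.0 / q)

    observed = stat(residuals)
    # Cyclic shifts move each block of residuals into the post-period slot once.
    # k = 0 is the identity shift, so the observed ordering is always counted.
    n_as_extreme = 0
    for k in range(n_total):
        shifted = residuals[k:] + residuals[:k]
        if stat(shifted) >= observed:
            n_as_extreme += 1
    return n_as_extreme / n_total


# Toy residuals: six quiet pre-periods followed by two large post-period residuals.
toy = [0.1, -0.2, 0.1, 0.0, -0.1, 0.2, 5.0, 5.0]
p_value = conformal_p_value(toy, n_post=2)
```

In the toy series only the identity shift places both large residuals in the post-period block, so the p-value is 1/8. With cyclic shifts the smallest attainable p-value is one over the number of periods, which is why short panels limit the resolution of this kind of test.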

Code Quality

Finding: None
Severity: None
Impact: generate_synthetic_control_data() validates core dimensions/timing arguments and implements the intended single-treated-unit donor-panel DGP cleanly: diff_diff/prep_dgp.py:L469-L683. Exports are wired through diff_diff/prep.py:L18-L31 and diff_diff/__init__.py:L124-L136,L418-L430.
Concrete fix: None required.

Performance

Finding: None
Severity: None
Impact: The heavier tutorial drift checks use a module-scoped fixture so the full SCM/conformal surface is run once: tests/test_t25_synthetic_control_policy_drift.py:L45-L94.
Concrete fix: None required.

Maintainability

Finding: None
Severity: None
Impact: The tutorial is connected to doc drift tracking for SyntheticControl, conformal, and prep DGP surfaces: docs/doc-deps.yaml:L640-L675 and docs/doc-deps.yaml:L917-L927.
Concrete fix: None required.

Tech Debt

Finding: None
Severity: None
Impact: Existing SyntheticControl deferred work remains tracked in TODO.md:L89-L90 and is not expanded by this PR.
Concrete fix: None required.

Security

Finding: None
Severity: None
Impact: No secrets or sensitive material were found in the changed surfaces.
Concrete fix: None required.

Documentation/Tests

Finding: None
Severity: None
Impact: Public API docs and changelog describe the new generator with the noiseless/observed-noise distinction: docs/api/prep.rst:L131-L160, CHANGELOG.md:L8-L14. The drift test pins quoted tutorial numbers, analytical NaN inference, conformal p-values/CIs, and notebook kwargs: tests/test_t25_synthetic_control_policy_drift.py:L144-L151,L185-L207,L210-L244.
Concrete fix: None required.

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 2477a6b8b2b4f6cc5a6203bae5e63c70a0eb343d


Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

  • This PR adds an SCM data generator, tutorial, docs, exports, and drift tests; it does not change estimator math, weighting, SE/variance, identification assumptions, or defaults.
  • Tutorial methodology matches the registry for SyntheticControl: analytical se/t_stat/p_value/conf_int stay NaN, with placebo/Firpo/CWZ inference surfaced separately (docs/methodology/REGISTRY.md:L2004-L2088).
  • The generator’s in-hull claim is now scoped to the noiseless path, with the observed noisy data described as an approximate fit, and is tested (diff_diff/prep_dgp.py:L503-L512, tests/test_prep.py:L1167-L1195).
  • No new inline inference anti-patterns or partial NaN guards were found in the changed files.
  • Static checks passed: git diff --check and notebook JSON parsing. Runtime tests could not be run because pytest and numpy are not installed.

Methodology

  • Finding: None
    Severity: None
    Impact: The affected method surface is existing SyntheticControl; the PR only adds a DGP/tutorial. The tutorial’s in-space placebo, Firpo-Possebom confidence set, ADH robustness, and CWZ conformal descriptions align with the registry and implementation contracts (diff_diff/synthetic_control.py:L22-L31, diff_diff/synthetic_control.py:L611-L648, diff_diff/conformal.py:L21-L28).
    Concrete fix: None required.
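For readers unfamiliar with the in-space placebo, the permutation logic can be sketched as below. It assumes the standard ADH post/pre RMSPE ratio as the test statistic and takes the per-unit gaps (observed minus synthetic) as given; the function names are hypothetical and the library's in_space_placebo may differ in details such as donor filtering.

```python
import math


def rmspe(gaps):
    """Root mean squared prediction error of a list of gaps."""
    return math.sqrt(sum(g * g for g in gaps) / len(gaps))


def placebo_p_value(treated_gaps, donor_gaps, n_pre):
    """Share of units whose post/pre RMSPE ratio is at least the treated unit's.

    treated_gaps : gaps for the treated unit, pre-periods first
    donor_gaps   : one gap list per donor, each from refitting with that donor as placebo
    n_pre        : number of pre-treatment periods
    """
    def ratio(gaps):
        return rmspe(gaps[n_pre:]) / rmspe(gaps[:n_pre])

    treated_ratio = ratio(treated_gaps)
    # The treated unit counts as one of the permutations, hence the leading 1.
    n_extreme = 1 + sum(ratio(g) >= treated_ratio for g in donor_gaps)
    return n_extreme / (1 + len(donor_gaps))


# Toy gaps: four pre-periods, two post-periods, four placebo donors.
treated = [0.1, -0.1, 0.1, -0.1, 7.0, 7.0]
placebos = [
    [0.2, -0.2, 0.2, -0.2, 0.3, -0.3],
    [0.1, 0.1, -0.1, -0.1, 0.2, 0.1],
    [0.3, -0.3, 0.3, -0.3, 0.2, -0.4],
    [0.2, 0.1, -0.2, -0.1, -0.3, 0.2],
]
p_value = placebo_p_value(treated, placebos, n_pre=4)
```

Because the treated unit is included in the reference set, the smallest possible p-value with J donors is 1/(J + 1), so the donor pool size bounds how strong the permutation evidence can be.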

Code Quality

  • Finding: None
    Severity: None
    Impact: generate_synthetic_control_data() validates core dimensions/timing, constructs one treated unit plus donors, and cleanly emits treatment, ever-treated, true-effect, and predictor columns (diff_diff/prep_dgp.py:L593-L683). Public exports are wired through prep and package __all__ (diff_diff/prep.py:L18-L31, diff_diff/__init__.py:L120-L136, diff_diff/__init__.py:L419-L430).
    Concrete fix: None required.

Performance

  • Finding: None
    Severity: None
    Impact: The heavier tutorial drift checks use a module-scoped fixture so the SCM/conformal workflow runs once (tests/test_t25_synthetic_control_policy_drift.py:L45-L94).
    Concrete fix: None required.

Maintainability

  • Finding: None
    Severity: None
    Impact: Documentation drift tracking links the new tutorial to SyntheticControl, SyntheticControlResults, conformal.py, and prep_dgp.py (docs/doc-deps.yaml:L646-L675, docs/doc-deps.yaml:L917-L927).
    Concrete fix: None required.

Tech Debt

  • Finding: None
    Severity: None
    Impact: Existing deferred SyntheticControl items remain tracked in TODO.md:L89-L90; this PR does not introduce untracked correctness debt.
    Concrete fix: None required.

Security

  • Finding: None
    Severity: None
    Impact: No secrets, credentials, or sensitive data were identified in the changed code/docs/tutorial surfaces.
    Concrete fix: None required.

Documentation/Tests

  • Finding: None
    Severity: None
    Impact: API docs, tutorial toctree, LLM guide, and drift tests are updated for the new generator/tutorial (docs/api/prep.rst:L131-L161, docs/index.rst:L100-L112, diff_diff/guides/llms-full.txt:L1990-L1993, tests/test_t25_synthetic_control_policy_drift.py:L97-L244). Static validation passed; runtime tests were not executable here due to missing pytest/numpy.
    Concrete fix: None required.

@igerber added the ready-for-ci label (Triggers CI test workflows) on Jun 23, 2026
@igerber merged commit 531db01 into main on Jun 23, 2026 (33 of 35 checks passed)
@igerber deleted the docs/scm-tutorial branch on June 23, 2026 14:45