Add survey real-data validation against R (API, NHANES, RECS) by igerber · Pull Request #266 · igerber/diff-diff

igerber · 2026-04-04T17:58:46Z

Summary

Validate diff-diff's survey variance against R's survey package using three real federal survey datasets
Suite A (API): 8 tests — TSL variance with strata, FPC, subpopulations, covariates, and Fay's BRR replicates from R's canonical apistrat dataset
Suite B (NHANES): 4 tests + 1 skipped — TSL with real CDC strata + PSU + nest=TRUE, using ACA young adult coverage provision (2007-08 vs 2015-16)
Suite C (RECS): 3 tests — JK1 replicate weight variance with 60 real EIA replicate columns
All metrics (ATT, SE, df, CI) match R's values to within 1e-10
Add real-data section to survey tutorial (Section 10) demonstrating NHANES ACA DiD with actual CDC data
Document results in docs/benchmarks.rst with reproduction instructions
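
For readers unfamiliar with the replicate-weight variance that Suites A (Fay's BRR) and C (JK1) exercise, here is a minimal sketch of the textbook formulas. It is not diff-diff's implementation: the function names are hypothetical, the data are toy values, and deviations are centered on the full-sample estimate (the `mse = TRUE` convention in R's survey package), which may differ from the convention the tests use.

```python
import numpy as np


def weighted_mean(y, w):
    """Weighted mean of y under weights w."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * y) / np.sum(w))


def jk1_scale(n_reps):
    """JK1 scale factor: (R - 1) / R for R replicates."""
    return (n_reps - 1) / n_reps


def fay_brr_scale(n_reps, rho):
    """Fay's BRR scale factor: 1 / (R * (1 - rho)^2)."""
    return 1.0 / (n_reps * (1.0 - rho) ** 2)


def replicate_variance(theta_full, theta_reps, scale):
    """scale * sum over replicates of (theta_r - theta_full)^2."""
    reps = np.asarray(theta_reps, dtype=float)
    return float(scale * np.sum((reps - theta_full) ** 2))


# Toy data (assumption): 4 observations, 3 replicate-weight columns.
y = np.array([1.0, 2.0, 3.0, 4.0])
full_w = np.array([10.0, 10.0, 10.0, 10.0])
rep_w = np.array([
    [12.0, 9.0, 9.0],
    [9.0, 12.0, 9.0],
    [9.0, 9.0, 12.0],
    [10.0, 10.0, 10.0],
])

theta = weighted_mean(y, full_w)
theta_reps = [weighted_mean(y, rep_w[:, r]) for r in range(rep_w.shape[1])]
se_jk1 = replicate_variance(theta, theta_reps, jk1_scale(rep_w.shape[1])) ** 0.5
```

With the 60 RECS replicate columns, the JK1 scale under this formula would be 59/60; only the estimator that produces each replicate estimate changes.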

Methodology references

N/A — no estimator or math changes. This PR adds validation tests, benchmark scripts, and documentation only.

Validation

Tests added: tests/test_survey_real_data.py (15 tests + 1 skip)
R benchmark scripts: benchmarks/R/benchmark_realdata_{api,nhanes,recs}.R
Tutorial updated: docs/tutorials/16_survey_did.ipynb (Section 10: real NHANES data)
Results validated against published ACA literature (Antwi et al. 2013, Sommers 2012)
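
The golden-value comparison described above could look roughly like the following. This is a sketch under stated assumptions: the helper name, the JSON keys, and the numbers are placeholders for illustration, not the contents of the PR's golden files or of tests/test_survey_real_data.py.

```python
import json
import math


def assert_matches_golden(results, golden, abs_tol=1e-10):
    """Raise AssertionError if any metric differs from its golden value by more than abs_tol."""
    for name, expected in golden.items():
        actual = results[name]
        if not math.isclose(actual, expected, rel_tol=0.0, abs_tol=abs_tol):
            raise AssertionError(
                f"{name}: got {actual!r}, expected {expected!r} "
                f"(|diff| = {abs(actual - expected):.3e})"
            )


# Placeholder golden values (assumption); a real file would be written by the R script.
golden = json.loads('{"att": 1.5, "se": 0.25, "df": 10.0}')

# Placeholder Python-side results (assumption); in practice these come from the estimator.
results = {"att": 1.5 + 1e-13, "se": 0.25, "df": 10.0}

assert_matches_golden(results, golden)
```

An absolute tolerance is used here because the PR reports differences below 1e-10; a relative tolerance may be the better choice for metrics far from unit scale.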

Security / privacy

Confirm no secrets/PII in this PR: Yes
NHANES data is public-use (no geographic identifiers, no individual names)
RECS data is public-use microdata from EIA

Generated with Claude Code

Validate diff-diff's survey variance estimation against R's survey package using three real-world datasets: California API (strata+FPC), NHANES ACA young adult coverage (strata+PSU+nest), and RECS 2020 (JK1 replicate weights). All 15 tests match R to within 1e-10.

Includes R benchmark scripts, Python download scripts, golden value JSON files, and a real-data section in the survey tutorial demonstrating the ACA dependent coverage provision DiD on actual CDC data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

igerber · 2026-04-04T18:01:59Z

/ai-review

Golden JSON files (4.3MB) and notebook outputs exceeded Codex's 1MB character limit. Exclude benchmarks/data/**/*.json, *.csv, and docs/tutorials/*.ipynb from the unified diff while keeping them in the --stat summary so the reviewer knows they changed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
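
The exclusion rule in this commit can be illustrated with a small path filter. This is only a sketch of the idea, not the review workflow's actual code: the function name is hypothetical, and `fnmatch` globbing is an assumption that differs from git pathspec matching (for example, `*` here also matches `/`).

```python
from fnmatch import fnmatchcase

# Patterns taken from the commit message above.
EXCLUDE_FROM_DIFF = [
    "benchmarks/data/**/*.json",
    "*.csv",
    "docs/tutorials/*.ipynb",
]


def split_for_review(paths, patterns=EXCLUDE_FROM_DIFF):
    """Return (in_diff, stat_only): paths shown in full vs. paths listed in the summary only."""
    in_diff, stat_only = [], []
    for path in paths:
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            stat_only.append(path)
        else:
            in_diff.append(path)
    return in_diff, stat_only


# Example paths: the first two appear in the PR description; the last is hypothetical.
changed = [
    "tests/test_survey_real_data.py",
    "docs/tutorials/16_survey_did.ipynb",
    "benchmarks/data/example/golden.json",
]
in_diff, stat_only = split_for_review(changed)
```

Every changed path still lands in one of the two lists, so the reviewer sees that large files changed without receiving their contents.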

igerber · 2026-04-04T18:08:05Z

/ai-review

igerber · 2026-04-04T18:10:18Z

/ai-review

igerber · 2026-04-04T18:12:40Z

/ai-review

igerber closed this Apr 4, 2026

igerber reopened this Apr 4, 2026

igerber closed this Apr 4, 2026

igerber wants to merge 2 commits into main from survey-real-data-validation
