1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Added
- **`ChaisemartinDHaultfoeuille.by_path` and `paths_of_interest` now compose with `heterogeneity="<col>"`** (Web Appendix Section 1.5, Lemma 7). Per-path heterogeneity coefficient is computed by re-running the Lemma 7 regression on each path-restricted switcher subsample. The path filter (`path_groups: Optional[Set[int]]`) restricts eligibility to switchers ON path `p` inside the inner regression; the variance machinery (standard WLS vcov for non-survey, cell-period IF allocator for Binder TSL, group-level allocator for Rao-Wu replicate) is unchanged from the global heterogeneity path. Cohort dummies in the design matrix absorb baseline by construction, so multi-baseline switcher panels do not produce R-divergence (no parallel `UserWarning` like `controls` / `trends_linear`). Surfaces on `results.path_heterogeneity_effects` keyed `{path: {l: {beta, se, t_stat, p_value, conf_int, n_obs}}}` and on `results.to_dataframe(level="by_path")` via new always-present `het_*` columns (`het_beta`, `het_se`, `het_t_stat`, `het_p_value`, `het_conf_int_lower`, `het_conf_int_upper`), populated for positive-horizon rows when `heterogeneity` is set and NaN otherwise (mirrors the `cband_*` and `cumulated_*` always-present convention). Composes with `survey_design` (analytical Binder TSL + replicate-weight bootstrap) via the existing PR #408 IF allocator path; under replicate weights, every per-(path, horizon) fit appends `n_valid` to the shared `_replicate_n_valid_list` accumulator and the final `_effective_df_survey` recomputation reflects all per-path appends. R parity verified against `did_multiplegt_dyn(..., by_path=3, predict_het=list("het_x", c(1,2,3)))` on the new `multi_path_reversible_by_path_predict_het` golden-value scenario; a sibling global anchor `multi_path_reversible_predict_het` introduces the FIRST `predict_het` R-parity baseline in the repo (no prior `TestDCDHDynRParityHeterogeneity` existed). 
Both R calls use `dont_drop_larger_lower=TRUE` to match the Python `drop_larger_lower=False` requirement and to provide cohort variation at every horizon under reversal paths. Per-path SE matches global SE bit-exactly on a single-path panel (telescope invariant, `atol=rtol=1e-14`). Multiplier bootstrap (`n_bootstrap > 0`) under `by_path + heterogeneity + survey_design` inherits the existing per-path multiplier-bootstrap-survey gate from PR #408. The `NotImplementedError` gate at `chaisemartin_dhaultfoeuille.py:1230-1234` is removed; `heterogeneity` precondition mutex with `controls` / `trends_linear` / `trends_nonparam` stays in place. Cross-surface invariants regression-tested at `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathHeterogeneity` (~13 tests across gate dispatch, behavior, single-path telescope, zero-signal anti-regression, multi-baseline UserWarning anti-regression, DataFrame integration, edge cases) + `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityHeterogeneity` (global anchor) + `::TestDCDHDynRParityByPathHeterogeneity` (per-path). See `docs/methodology/REGISTRY.md` §`ChaisemartinDHaultfoeuille` `Note (Phase 3 by_path ...)` → "Per-path heterogeneity testing" for the full contract.
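
The per-path mechanic above — re-running the Lemma 7 heterogeneity regression on a path-restricted switcher subsample — can be sketched as a plain weighted least squares on the subsample. This is a minimal illustration, not the library's implementation: `path_restricted_wls` is a hypothetical helper name, and the real design matrix includes cohort dummies and the library's own weighting.

```python
import numpy as np

def path_restricted_wls(X, y, w, path, p):
    """Sketch: re-fit a WLS on the rows belonging to path p only.

    X: (n, k) design (heterogeneity covariate plus intercept/cohort dummies),
    y: horizon-l outcome products, w: regression weights, path: per-row path id.
    Hypothetical helper; the actual Lemma 7 regression lives in the library.
    """
    m = path == p
    Xp, yp, wp = X[m], y[m], w[m]
    W = np.diag(wp)
    # Solve the weighted normal equations (X'WX) beta = X'Wy on the subsample.
    return np.linalg.solve(Xp.T @ W @ Xp, Xp.T @ W @ yp)
```

On a panel with a single observed path, the subsample equals the full sample, which is the telescope invariant the entry pins at `atol=rtol=1e-14`.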
- **Tutorial 21: HAD Pre-test Workflow** (`docs/tutorials/21_had_pretest_workflow.ipynb`) — composite pre-test walkthrough for `HeterogeneousAdoptionDiD` building on Tutorial 20's brand-campaign framing. Uses a 60-DMA × 8-week panel close in shape to T20's but with the dose distribution drawn from `Uniform[$0.01K, $50K]` (vs T20's `[$5K, $50K]`); the true support is strictly positive but very near zero, chosen so the QUG step in `did_had_pretest_workflow` fails to reject `H0: d_lower = 0` in this finite sample and the verdict text fires the load-bearing "Assumption 7 deferred" pivot for the upgrade-arc narrative. (HAD's `design="auto"` selector — a separate min/median heuristic at `had.py::_detect_design`, NOT the QUG p-value — independently lands on the `continuous_at_zero` identification path with target `WAS` on this panel because `d.min() < 0.01 * median(|d|)`. The QUG test and the design selector are independent rules that point to the same identification path here.) Walks through three surfaces: (a) `did_had_pretest_workflow(aggregate="overall")` on a two-period collapse, where the verdict explicitly flags Step 2 (Assumption 7 pre-trends) as not run because a single pre-period structurally cannot support a pre-trends test, and the structural fields `pretrends_joint` / `homogeneity_joint` are both `None`; (b) `did_had_pretest_workflow(aggregate="event_study")` on the full multi-period panel, where the verdict reads "TWFE admissible under Section 4 assumptions" because all three testable diagnostics (QUG + joint pre-trends Stute over 3 horizons + joint homogeneity Stute over 4 horizons) fail to reject — non-rejection evidence under finite-sample power and test specification, not proof that the identifying assumptions hold; and (c) a side panel exercising both `yatchew_hr_test` null modes — `null="linearity"` (default, paper Theorem 7) vs `null="mean_independence"` (Phase 4 R-parity with R `YatchewTest::yatchew_test(order=0)`) — on the within-pre-period
first-difference paired with post-period dose, illustrating the stricter null's larger residual variance (`sigma2_lin` 7.01 vs 6.53) and smaller p-value (0.29 vs 0.49). Companion drift-test file `tests/test_t21_had_pretest_workflow_drift.py` (16 tests pinning panel composition, both verdict pivots, structural anchors on both paths, deterministic QUG / Yatchew statistics, bootstrap p-value tolerance bands per `feedback_bootstrap_drift_tests_need_backend_tolerance`, and `HAD(design="auto")` resolution to `continuous_at_zero` on this panel). T20's "Composite pretest workflow" Extensions bullet updated with a forward-pointer to T21. T22 weighted/survey HAD tutorial remains queued as a separate notebook PR.
- **`ChaisemartinDHaultfoeuille.by_path` and `paths_of_interest` now compose with `survey_design`** for analytical Binder TSL SE and replicate-weight bootstrap variance. The `NotImplementedError` gate at `chaisemartin_dhaultfoeuille.py:1233-1239` is replaced by a per-path multiplier-bootstrap-only gate (`survey_design + n_bootstrap > 0` under by_path / paths_of_interest still raises, since the survey-aware perturbation pivot for path-restricted IFs is methodologically underived). Per-path SE routes through the existing `_survey_se_from_group_if` cell-period allocator: the per-period IF (`U_pp_l_path`) is built with non-path switcher-side contributions skipped (control contributions are unchanged, matching the joiners/leavers IF convention; preserves the row-sum identity `U_pp.sum(axis=1) == U`), cohort-recentered via `_cohort_recenter_per_period`, then expanded to observations as `psi_i = U_pp[g_i, t_i] · (w_i / W_{g_i, t_i})`. Replicate-weight designs unconditionally use the cell allocator (Class A contract from PR #323). New `_refresh_path_inference` helper post-call refreshes `safe_inference` on every populated entry across `multi_horizon_inference`, `placebo_horizon_inference`, `path_effects`, and `path_placebos` so all four surfaces use the same final `df_survey` after per-path replicate fits append `n_valid` to the shared accumulator. Path-enumeration ranking under `survey_design` remains unweighted (group-cardinality, not population-weight mass). Lonely-PSU policy stays sample-wide, not per-path. Telescope invariant: on a single-path panel, per-path SE matches the global non-by_path survey SE bit-exactly. **No R parity** — R `did_multiplegt_dyn` does not support survey weighting; this is a Python-only methodology extension. 
The global non-by_path TSL multiplier-bootstrap path is unaffected (anti-regression test `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathSurveyDesignAnalytical::test_global_survey_plus_n_bootstrap_still_works` locks the per-path-only scope of the new gate). Cross-surface invariants regression-tested at `TestByPathSurveyDesignAnalytical` (~17 tests across gate / dispatch / analytical SE / replicate-weight SE / per-path placebos / `trends_linear` composition / unobserved-path warnings / final-df refresh regressions) and `TestByPathSurveyDesignTelescope`. See `docs/methodology/REGISTRY.md` §`ChaisemartinDHaultfoeuille` `Note (Phase 3 by_path ...)` → "Per-path survey-design SE" for the full contract.
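
The cell-period allocator expansion described above can be sketched in a few lines of numpy. This is an illustrative stand-in under assumed shapes (not the library's `_survey_se_from_group_if`): it implements `psi_i = U_pp[g_i, t_i] · (w_i / W_{g_i, t_i})` and makes the row-sum identity `U_pp.sum(axis=1) == U` checkable.

```python
import numpy as np

def expand_if_to_observations(U_pp, g, t, w):
    """Sketch of the cell-period IF allocator (hypothetical signature).

    U_pp: (G, T) per-(group, period) influence contributions whose row sums
    give the group-level IF; g, t: per-observation group/period indices;
    w: per-observation weights. Returns psi_i = U_pp[g_i, t_i] * w_i / W_{g_i, t_i},
    where W_{g, t} is the total weight in cell (g, t).
    """
    W = np.zeros(U_pp.shape)
    np.add.at(W, (g, t), w)          # per-cell weight totals
    return U_pp[g, t] * w / W[g, t]  # weight-share allocation within each cell
```

Because the allocation is a weight share within each cell, summing `psi` over a cell recovers `U_pp[g, t]`, and summing over a group recovers the group-level IF — the identity the path-restricted construction is required to preserve.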
- **Inference-field aliases on staggered result classes** for adapter / external-consumer compatibility. Read-only `@property` aliases expose the flat `att` / `se` / `conf_int` / `p_value` / `t_stat` names (matching `DiDResults` / `TROPResults` / `SyntheticDiDResults` / `HeterogeneousAdoptionDiDResults`) on every result class that previously only carried prefixed canonical fields: `CallawaySantAnnaResults`, `StackedDiDResults`, `EfficientDiDResults`, `ChaisemartinDHaultfoeuilleResults`, `StaggeredTripleDiffResults`, `WooldridgeDiDResults`, `SunAbrahamResults`, `ImputationDiDResults`, `TwoStageDiDResults` (mapping to `overall_*`); `ContinuousDiDResults` (mapping to `overall_att_*`, ATT-side as the headline, ACRT-side accessible unchanged via `overall_acrt_*`); `MultiPeriodDiDResults` (mapping to `avg_*`). `ContinuousDiDResults` additionally exposes `overall_se` / `overall_conf_int` / `overall_p_value` / `overall_t_stat` aliases for naming consistency with the rest of the staggered family. Aliases are pure read-throughs over the canonical fields — no recomputation, no behavior change — so the `safe_inference()` joint-NaN contract (per CLAUDE.md "Inference computation") is inherited automatically (NaN canonical → NaN alias, locked at `tests/test_result_aliases.py::test_pattern_b_aliases_propagate_nan`). The native `overall_*` / `overall_att_*` / `avg_*` fields remain canonical for documentation and computation. Motivated by the `balance.interop.diff_diff.as_balance_diagnostic()` adapter (`facebookresearch/balance` PR #465) which calls `getattr(res, "se", None)` / `getattr(res, "conf_int", None)` without a fallback chain — pre-alias, every staggered result class returned `None` on those keys, silently dropping `se` and `conf_int` from the adapter's diagnostic dict. 23 alias-mechanic + balance-adapter regression tests at `tests/test_result_aliases.py`. Patch-level (additive on stable surfaces).
185 changes: 185 additions & 0 deletions benchmarks/R/generate_dcdh_dynr_test_values.R
@@ -618,6 +618,61 @@ extract_dcdh_by_path <- function(res, n_effects, n_placebos = 0) {
list(by_path = out)
}

# Helper: extract global predict_het results. R's predict_het slot is at
# res$results$predict_het, a data.frame with columns
# {effect, covariate, Estimate, SE, t, LB, UB, N, pF}. Estimate is the
# WLS coefficient on the heterogeneity covariate.
extract_dcdh_predict_het <- function(res, n_effects) {
ph <- res$results$predict_het
horizons <- list()
if (is.null(ph) || nrow(ph) == 0) return(list(predict_het = horizons))
for (h in seq_len(min(n_effects, nrow(ph)))) {
horizons[[as.character(ph$effect[h])]] <- list(
beta = as.numeric(ph$Estimate[h]),
se = as.numeric(ph$SE[h]),
t = as.numeric(ph$t[h]),
ci_lo = as.numeric(ph$LB[h]),
ci_hi = as.numeric(ph$UB[h]),
n_obs = as.numeric(ph$N[h]),
p_value = as.numeric(ph$pF[h])
)
}
list(predict_het = horizons)
}

# Helper: extract per-path predict_het results. Under by_path=k +
# predict_het, R's per-by_level dispatcher writes a predict_het table to
# each res$by_level_i$results$predict_het. Output mirrors
# extract_dcdh_by_path's shape with a horizons dict keyed by horizon.
extract_dcdh_by_path_predict_het <- function(res, n_effects) {
by_levels <- res$by_levels
out <- list()
for (i in seq_along(by_levels)) {
slot <- res[[paste0("by_level_", i)]]
ph <- slot$results$predict_het
horizons <- list()
if (!is.null(ph) && nrow(ph) > 0) {
for (h in seq_len(min(n_effects, nrow(ph)))) {
horizons[[as.character(ph$effect[h])]] <- list(
beta = as.numeric(ph$Estimate[h]),
se = as.numeric(ph$SE[h]),
t = as.numeric(ph$t[h]),
ci_lo = as.numeric(ph$LB[h]),
ci_hi = as.numeric(ph$UB[h]),
n_obs = as.numeric(ph$N[h]),
p_value = as.numeric(ph$pF[h])
)
}
}
out[[i]] <- list(
path = by_levels[i],
frequency_rank = i,
horizons = horizons
)
}
list(by_path_predict_het = out)
}
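
Both extractors above emit a horizons dict keyed by horizon label, with seven numeric fields per record. A consumer-side shape check (hypothetical, Python-side — not part of this generator script) makes the contract explicit for the golden-value parity tests:

```python
REQUIRED_KEYS = {"beta", "se", "t", "ci_lo", "ci_hi", "n_obs", "p_value"}

def check_predict_het_horizons(horizons):
    """Sketch: validate the horizons dict shape emitted by the R extractors.

    horizons maps a horizon label (e.g. "1") to a dict with the seven
    numeric fields listed in REQUIRED_KEYS.
    """
    for label, rec in horizons.items():
        missing = REQUIRED_KEYS - rec.keys()
        if missing:
            raise ValueError(f"horizon {label} missing {sorted(missing)}")
        if not all(isinstance(v, (int, float)) for v in rec.values()):
            raise TypeError(f"horizon {label} has non-numeric fields")
    return True
```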

# Scenario 13: mixed_single_switch + by_path=2 (basic 2-path case).
# The mixed_single_switch DGP produces joiners (path 0,1,1,1) and
# leavers (path 1,0,0,0) as its only two observed paths at L_max=3, so
Expand Down Expand Up @@ -1018,6 +1073,136 @@ cat(" Scenario 19: multi_path_reversible_by_path_non_binary\n")
)
}

# Scenarios 20 + 21: predict_het R-parity (Wave 5 #11). Scenario 20 is
# the global anchor (predict_het without by_path); scenario 21 is the
# per-path version (by_path + predict_het). Both use the same DGP so the
# Python-side parity tests can calibrate atol on the global scenario and
# inherit the same atol for the per-path test. R's predict_het syntax
# requires a 2-element list: list(covariate_name, horizons_vec).
#
# DGP shape: 90 switchers + 30 never-treated controls, 10 periods, 3
# paths (0,1,1,1) / (0,1,0,0) / (0,1,1,0), F_g varies in {3,4,5}
# INDEPENDENTLY of path so each path has multiple cohorts. het_x is
# binary {0,1}, balanced across both switchers and controls. Effect =
# 5.0 + 3.0 * het_x to produce a detectable heterogeneity signal.
# Never-treated controls are required for R's predict_het to compute
# horizons l>=2 under reversal paths (otherwise R returns
# "max effects = 2" or empty cohort dummies). `dont_drop_larger_lower
# = TRUE` matches the Python by_path requirement that
# `drop_larger_lower=False`.
cat(" Scenarios 20/21: multi_path_reversible_predict_het + by_path version\n")
{
set.seed(120L)
n_switchers20 <- 90L
n_controls20 <- 30L
n_groups20 <- n_switchers20 + n_controls20
n_periods20 <- 10L
paths20 <- list(c(0L, 1L, 1L, 1L), c(0L, 1L, 0L, 0L), c(0L, 1L, 1L, 0L))
D20 <- matrix(0L, nrow = n_groups20, ncol = n_periods20)
het_x20 <- integer(n_groups20)
group_fe20 <- rnorm(n_groups20, 0, 2.0)
# Switchers (groups 1..n_switchers20)
for (g in seq_len(n_switchers20)) {
F_g_choice <- ((g - 1L) %/% 3L) %% 3L
F_g <- 3L + F_g_choice
path_idx <- ((g - 1L) %% 3L) + 1L
pp <- paths20[[path_idx]]
het_x20[g] <- if (g <= n_switchers20 %/% 2L) 1L else 0L
for (j in seq_along(pp)) {
t <- F_g - 1L + j
if (t >= 1L && t <= n_periods20) D20[g, t] <- pp[j]
}
if (F_g - 1L + length(pp) <= n_periods20) {
tail_t <- (F_g - 1L + length(pp)):n_periods20
D20[g, tail_t] <- pp[length(pp)]
}
}
# Never-treated controls (groups n_switchers20+1..n_groups20).
# Balanced het_x assignment so the heterogeneity covariate has
# variation in the control pool too.
for (g in (n_switchers20 + 1L):n_groups20) {
k <- g - n_switchers20
het_x20[g] <- if (k <= n_controls20 %/% 2L) 1L else 0L
}
noise20 <- matrix(rnorm(n_groups20 * n_periods20, 0, 0.5),
nrow = n_groups20, ncol = n_periods20)
period_arr20 <- 0:(n_periods20 - 1L)
effect20 <- 5.0 + 3.0 * het_x20 # heterogeneity signal
Y20 <- matrix(group_fe20, nrow = n_groups20, ncol = n_periods20) +
matrix(0.5 * period_arr20, nrow = n_groups20, ncol = n_periods20, byrow = TRUE) +
matrix(effect20, nrow = n_groups20, ncol = n_periods20) * D20 +
noise20
d20 <- data.frame(
group = rep(seq_len(n_groups20) - 1L, each = n_periods20),
period = rep(period_arr20, n_groups20),
treatment = as.vector(t(D20)),
outcome = as.vector(t(Y20)),
het_x = rep(het_x20, each = n_periods20)
)

# Scenario 20: global predict_het anchor (no by_path).
# `dont_drop_larger_lower = TRUE` preserves multi-switch cohorts so the
# heterogeneity regression has cohort variation at every horizon.
# Without this, R drops off-switch paths at l>=2, leaving a single
# cohort and triggering the `prod_het_l_XX ~ het_x + ` empty-cohort
# error. dCDH `by_path` requires `drop_larger_lower=False` on the Python
# side anyway, so this flag is consistent with the per-path scope.
res20 <- did_multiplegt_dyn(
df = d20, outcome = "outcome", group = "group", time = "period",
treatment = "treatment", effects = 3,
dont_drop_larger_lower = TRUE,
predict_het = list("het_x", c(1, 2, 3)),
ci_level = 95, graph_off = TRUE
)
scenarios$multi_path_reversible_predict_het <- list(
data = list(
group = as.numeric(d20$group),
period = as.numeric(d20$period),
treatment = as.numeric(d20$treatment),
outcome = as.numeric(d20$outcome),
het_x = as.numeric(d20$het_x)
),
params = list(pattern = "multi_path_reversible_predict_het",
n_switchers = n_switchers20, n_controls = n_controls20,
n_groups = n_groups20, n_periods = n_periods20,
seed = 120L, effects = 3,
predict_het_var = "het_x",
predict_het_horizons = c(1, 2, 3),
ci_level = 95,
dont_drop_larger_lower = TRUE),
results = extract_dcdh_predict_het(res20, n_effects = 3)
)

# Scenario 21: by_path + predict_het (per-path version on same DGP).
# `dont_drop_larger_lower = TRUE` matches scenario 20 + Python
# by_path's `drop_larger_lower=False` requirement.
res21 <- did_multiplegt_dyn(
df = d20, outcome = "outcome", group = "group", time = "period",
treatment = "treatment", effects = 3, by_path = 3,
dont_drop_larger_lower = TRUE,
predict_het = list("het_x", c(1, 2, 3)),
ci_level = 95, graph_off = TRUE
)
scenarios$multi_path_reversible_by_path_predict_het <- list(
data = list(
group = as.numeric(d20$group),
period = as.numeric(d20$period),
treatment = as.numeric(d20$treatment),
outcome = as.numeric(d20$outcome),
het_x = as.numeric(d20$het_x)
),
params = list(pattern = "multi_path_reversible_by_path_predict_het",
n_switchers = n_switchers20, n_controls = n_controls20,
n_groups = n_groups20, n_periods = n_periods20,
seed = 120L, effects = 3, by_path = 3,
predict_het_var = "het_x",
predict_het_horizons = c(1, 2, 3),
ci_level = 95,
dont_drop_larger_lower = TRUE),
results = extract_dcdh_by_path_predict_het(res21, n_effects = 3)
)
}
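
The DGP comment for scenarios 20/21 claims that F_g varies in {3, 4, 5} independently of path, so each path has multiple cohorts. A small Python sketch mirroring the R assignment rules (`F_g = 3 + ((g - 1) %/% 3) %% 3`, `path_idx = (g - 1) %% 3 + 1`) lets that claim be checked directly:

```python
def cohorts_per_path(n_switchers=90):
    """Mirror the switcher-assignment rules from the R block above and
    return, per path index, the set of first-switch cohorts it contains."""
    seen = {}
    for g in range(1, n_switchers + 1):
        F_g = 3 + ((g - 1) // 3) % 3   # cohort cycles every 3 groups
        path_idx = (g - 1) % 3 + 1     # path cycles every group
        seen.setdefault(path_idx, set()).add(F_g)
    return seen
```

Because the two cycles have coprime phase over each block of 9 groups, every path accumulates all three cohorts, which is what keeps the per-path heterogeneity regressions from collapsing to a single cohort dummy.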

# ---------------------------------------------------------------------------
# Write output
# ---------------------------------------------------------------------------