Difference-in-differences for mediation analysis using double machine learning
Abstract: We propose a difference-in-differences (DiD) framework with mediation for possibly multivalued discrete or continuous treatments and mediators, aimed at identifying the direct effect of the treatment on the outcome (net of effects operating through the mediator), the indirect effect via the mediator, and the joint effects of treatment and mediator, consistent with the framework of dynamic treatment effects. Identification relies on a conditional parallel trends assumption imposed on the mean potential outcome across treatment and mediator states, or (depending on the causal parameter) additionally on the mean potential outcomes and potential mediator distributions across treatment states. We propose ATET estimators for repeated cross sections and panel data within the double/debiased machine learning framework, which allows for data-driven control of covariates, and we establish their asymptotic normality under standard regularity conditions. We investigate the finite-sample performance of the proposed methods in a simulation study and illustrate our approach in an empirical application to the US National Longitudinal Survey of Youth, estimating the direct effect of health care coverage on general health as well as the indirect effect operating through routine checkups.
Explain it Like I'm 14
What is this paper about?
This paper introduces a new way to study cause-and-effect in “before and after” data when there’s a middle step (called a mediator) between a cause and the final result. It mixes a popular tool called difference-in-differences (DiD) with modern machine learning to carefully separate:
- the direct effect of a treatment (what the treatment does by itself),
- the indirect effect (what happens because the treatment changes the mediator),
- and the combined effect of choosing specific levels of both the treatment and the mediator.
The authors also show how to do this in two common types of data: when you observe different people at different times (repeated cross sections) and when you track the same people over time (panel data).
What questions are the researchers asking?
Put simply, they want to answer:
- How much does a treatment change the outcome for the people who actually got it (the “average treatment effect on the treated,” or ATET)?
- How much of that change is direct (not through any middle step)?
- How much is indirect (through a mediator, like a routine checkup)?
- What happens if we set specific values for the treatment and the mediator (the “dynamic” or joint effects)?
For example, if the treatment is “health insurance coverage” and the mediator is “goes to routine checkups,” the paper studies:
- Direct effect: What does coverage do for health if we fix whether someone gets checkups?
- Indirect effect: How much of coverage’s effect on health comes specifically from causing more routine checkups?
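The split between direct and indirect effects can be written out with potential outcomes. The notation below is an assumption made for illustration (it follows common mediation conventions, not necessarily the paper's exact symbols): Y(d, m) is the outcome under treatment d and mediator m, and M(d) is the mediator value that would arise under treatment d. Conditioning on D = 1 gives the ATET-style version that matches the paper's focus on those who actually received the treatment.

```latex
% Total effect among the treated, split into a natural direct and a natural indirect part
E\big[Y(1, M(1)) - Y(0, M(0)) \mid D = 1\big]
  = \underbrace{E\big[Y(1, M(1)) - Y(0, M(1)) \mid D = 1\big]}_{\text{direct: treatment changes, mediator held at } M(1)}
  + \underbrace{E\big[Y(0, M(1)) - Y(0, M(0)) \mid D = 1\big]}_{\text{indirect: mediator changes, treatment held at } 0}

% Controlled direct effect: the mediator is fixed at the same value m for everyone
E\big[Y(1, m) - Y(0, m) \mid D = 1\big]
```

This is one of two possible decompositions; the other holds the mediator at M(0) in the direct part and the treatment at 1 in the indirect part. In the insurance example, Y(0, M(1)) would be the health of someone without coverage who nevertheless attends checkups as if covered.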
How did they study it?
To make these ideas accessible, here are the key parts in everyday language:
Key ideas explained simply
- Difference-in-Differences (DiD): Imagine two groups of people—one gets a treatment (like new health coverage) and one doesn’t. You compare how both groups change from “before” to “after.” If, without the treatment, the two groups would have followed similar trends, then the extra improvement in the treated group is attributed to the treatment. This “difference of differences” helps cancel out hidden, time-stable differences between the groups.
- Mediator: A mediator is a middle step in the chain of effects. For health coverage, a mediator could be “routine checkups.” The treatment (coverage) might boost checkups, which then improves health.
- Direct vs. indirect effects:
  - Direct effect: What the treatment does to the outcome not via the mediator.
  - Indirect effect: What the treatment does to the outcome because it changes the mediator.
- Dynamic (joint) effects: What happens when you set both the treatment and mediator to specific levels (like “coverage on, checkups off” vs. “coverage off, checkups on”).
- Double Machine Learning (DML): Think of DML as a smart helper that uses machine learning to adjust for many background factors (age, income, health history, etc.) without overfitting. It estimates “nuisance parts” (like predicted outcomes and group membership probabilities) and then plugs them into formulas that extract the causal effects.
- Doubly robust: The method builds two “backup plans” (one for outcomes, one for group/time probabilities). If one plan is a bit off, the other can still help keep the final estimate reliable.
- Neyman orthogonality: The formulas are designed so that small mistakes in those helper models (from machine learning) don’t easily mess up the main effect estimates—like shock absorbers that reduce the impact of bumps.
- Cross-fitting: The data is split so the machine learning models are trained on one part and evaluated on another. This avoids “cheating” (overfitting) and makes results more trustworthy.
- Repeated cross sections vs. panel data:
  - Repeated cross sections: Different people before and after.
  - Panel data: The same people observed at multiple times.
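To make the basic "difference of differences" idea concrete, here is a minimal sketch of the plain 2x2 DiD comparison on simulated repeated cross sections. It is not the paper's estimator: there are no covariates, no mediator, and no machine learning. The function names, the simulated data, and the effect size of 2.0 are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def did_2x2(rows):
    """Plain 2x2 DiD: (treated post - treated pre) - (control post - control pre).

    rows: iterable of (treated, post, outcome) with treated and post in {0, 1}.
    """
    cells = {(d, t): [] for d in (0, 1) for t in (0, 1)}
    for d, t, y in rows:
        cells[(d, t)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


def simulate(n=20000, effect=2.0, seed=0):
    """Hypothetical data: a time-stable group gap, a common trend, and a treatment effect."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        d = rng.randint(0, 1)  # treatment group indicator
        t = rng.randint(0, 1)  # post-period indicator
        # The group gap (1.5) and the common trend (1.0) cancel in the DiD;
        # only the effect on treated units in the post period remains.
        y = 1.5 * d + 1.0 * t + effect * d * t + rng.gauss(0, 1)
        rows.append((d, t, y))
    return rows


rows = simulate()
estimate = did_2x2(rows)
print(f"DiD estimate: {estimate:.3f} (simulated true effect: 2.0)")
```

A naive post-period comparison of treated and control units would also pick up the time-stable group gap of 1.5; differencing out the pre-period gap is what removes it.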
Assumptions (in plain terms)
To make these results valid, they assume:
- Conditional parallel trends: After adjusting for observed factors (like age and income), the “before-to-after” trend the treated group would have had without treatment matches the trend seen in the control group. For mediation, they adapt this idea to also account for the mediator.
- No anticipation: People don’t change their behavior before the treatment just because they expect it.
- Common support: For each type of person (based on observed factors), there are comparable individuals in all relevant groups (treated/untreated, before/after).
- Exogenous controls: The control variables they adjust for aren’t themselves changed by the treatment. (Otherwise, adjusting for them could accidentally remove part of the treatment’s true effect.)
- For “natural” direct/indirect effects: They sometimes also need a parallel trends idea to hold for the mediator’s distribution over time, not just the outcome.
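For reference, the textbook two-period version of conditional parallel trends, without a mediator, can be written as follows. The notation is an assumption for illustration: Y_t(0) is the untreated potential outcome in period t (0 = before, 1 = after), D is the treatment group indicator, and X are the covariates. The paper's mediation-specific assumptions extend this idea to treatment–mediator combinations and are not reproduced here.

```latex
% Conditional parallel trends for untreated potential outcomes
E\big[Y_1(0) - Y_0(0) \mid D = 1, X\big] = E\big[Y_1(0) - Y_0(0) \mid D = 0, X\big]

% Implied ATET with panel data, given no anticipation and common support
\Delta_{D=1} = E\big[Y_1 - Y_0 \mid D = 1\big]
  - E\Big[\, E\big[Y_1 - Y_0 \mid D = 0, X\big] \,\Big|\, D = 1 \Big]
```

The second line shows why common support matters: the inner expectation is estimated from control units and then averaged over the covariate values of treated units, so comparable controls must exist for every type of treated person.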
What did they find?
- New estimators: They build formulas and estimators that can separate direct and indirect effects within a DiD setup, even when treatments and mediators aren’t just yes/no (they can be multi-level or continuous).
- Flexibility with machine learning: Their double machine learning approach lets researchers control for many background variables automatically and safely, thanks to cross-fitting and orthogonal designs.
- Theoretical guarantees: Under standard conditions, their estimators are mathematically well-behaved (asymptotically normal), which is important for making confidence statements.
- Simulations: In simulation experiments with several thousand observations, the proposed estimators perform well.
- Real-world example (NLSY97): They analyze U.S. youth data to study health insurance coverage’s effect on general health, including the part that works through routine checkups. The point estimates suggest coverage might improve health, but they do not find statistically significant short-term effects among those who gain coverage—neither the total effect nor the direct/indirect pieces are clearly different from zero in the short run.
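The combination of cross-fitting, a doubly robust score, and standard errors from the score can be sketched for the simplest case: a panel-data DiD estimate of the ATET without a mediator. This is a simplified illustration, not the paper's mediation estimator. The function names and the simulated data are assumptions, and plain least squares and logistic regression stand in for the machine learners that would be used in practice.

```python
import numpy as np


def fit_ols(Z, y):
    """Least-squares coefficients (Z already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta


def fit_logit(Z, d, iters=25):
    """Logistic regression by Newton-Raphson; a stand-in for any ML classifier."""
    beta = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        w = p * (1.0 - p)
        hess = Z.T @ (Z * w[:, None]) + 1e-8 * np.eye(Z.shape[1])
        beta = beta + np.linalg.solve(hess, Z.T @ (d - p))
    return beta


def dr_did_atet(X, d, dy, n_folds=5, clip=0.01, seed=0):
    """Cross-fitted doubly robust DiD estimate of the ATET for panel data.

    X: covariates, d: treatment indicator, dy: outcome change (post minus pre).
    """
    n = len(d)
    Z = np.column_stack([np.ones(n), X])
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    g = np.empty(n)    # propensity score P(D=1 | X)
    ell = np.empty(n)  # predicted outcome change among controls, E[dy | D=0, X]
    for k in range(n_folds):
        train, test = folds != k, folds == k
        ctrl = train & (d == 0)
        # Nuisance models are fit on the training folds and evaluated on the held-out fold.
        ell[test] = Z[test] @ fit_ols(Z[ctrl], dy[ctrl])
        g[test] = 1.0 / (1.0 + np.exp(-Z[test] @ fit_logit(Z[train], d[train])))
    g = np.clip(g, clip, 1.0 - clip)  # guard against extreme weights
    w = d - (1.0 - d) * g / (1.0 - g)
    theta = np.sum(w * (dy - ell)) / np.sum(d)
    # Standard error from the estimated influence function of the score.
    infl = (w * (dy - ell) - theta * d) / np.mean(d)
    se = infl.std(ddof=1) / np.sqrt(n)
    return theta, se


# Hypothetical data: the outcome trend depends on X, so parallel trends hold only conditionally.
rng = np.random.default_rng(1)
n = 4000
X = rng.normal(size=(n, 2))
d = (rng.uniform(size=n) < 1 / (1 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))).astype(float)
dy = 1.0 + X[:, 0] + 2.0 * d + rng.normal(size=n)  # simulated true ATET is 2.0

theta, se = dr_did_atet(X, d, dy)
print(f"ATET estimate: {theta:.3f}, standard error: {se:.3f}")
```

The weight `w` equals 1 for treated units and minus the propensity odds for controls, so controls that resemble treated units count more. The outcome-change model and the propensity model are the two "backup plans" of the doubly robust construction.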
Why does it matter?
- Opens the “black box” of how treatments work: Instead of only asking “does it help?”, this method asks “how does it help?” and “by how much through each pathway?” That’s crucial for designing smarter policies.
- Handles complex realities: Treatments and mediators often aren’t just yes/no, and people’s backgrounds change over time. This framework deals with those complications.
- More reliable estimates: Using double machine learning and doubly robust designs reduces the risk that choosing the wrong model spoils the conclusions.
- Practical for modern data: Many studies have lots of variables; the method adapts to that and still keeps the causal story clean.
In short, the paper provides a careful, flexible, and modern way to break down treatment effects into direct and indirect parts in before-and-after studies, helping researchers and policymakers learn not just whether something works, but how and why.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
Below is a concise list of unresolved issues and concrete opportunities for further research arising from the paper’s assumptions, identification strategy, estimation framework, and empirical scope.
- Strength and credibility of mediation-specific parallel trends
- The conditional parallel trends assumption across treatment–mediator combinations (including strata defined by potential mediators) is very strong and empirically untestable; the paper offers no sensitivity or partial-identification analyses to assess robustness when it is violated. An open question is how to design sensitivity analyses, bounds, or negative-control strategies tailored to mediation DiD.
- For natural effects identified via Assumption 5, the requirement that unobservables affecting treatment do not directly affect the mediator (beyond treatment) is substantively demanding; methods leveraging instruments, proxies, or negative controls for the mediator–outcome confounding in a DiD-with-mediation setting are not developed.
- Exogeneity of covariates and post-treatment adjustment
- Assumption 4 (covariates not affected by treatment/mediator) is often implausible in repeated cross sections; the paper does not provide strategies to handle (or diagnose) conditioning on post-treatment covariates, nor front-door style adjustments or selection-on-latents approaches compatible with DiD mediation.
- Common support and overlap in high-dimensional treatment–mediator cells
- Identification requires overlap across four cells (D,M,T) for the target group and, with multivalued/continuous D and M, potentially across many cells; the paper does not develop diagnostics, trimming/overlap weighting, or stabilized-weight designs to handle sparsity, weak overlap, or rare mediator states.
- Dependence on pre-treatment mediator measures
- Identification of natural effects via distributional parallel trends requires observing a pre-treatment mediator M0; many applications lack this. Alternatives (e.g., proxy M0, panel instruments, or auxiliary samples) and their identification properties are left unexplored.
- Two-period setup and staggered/dynamic settings
- Methods are developed for two periods; generalizations to multiple pre- and post-periods, staggered adoption, event-study decomposition of direct/indirect effects, and time-varying mediator pathways are not provided.
- Single mediator restriction
- The framework treats one mediator; extensions to multiple mediators (possibly interacting or forming networks), mediator selection problems, and mediation path decomposition beyond a single M are not addressed.
- Continuous treatments/mediators: densities and implementation
- Although the abstract claims coverage of continuous D and M, core identification and DR score constructions are shown for discrete cases. How to consistently estimate required conditional densities/ratio weights with ML, ensure positivity for continuous supports, and derive orthogonal scores/influence functions is not detailed.
- Time-varying unobserved confounding
- DiD removes time-invariant confounding, but time-varying shocks that affect mediator and outcome in the post period can bias both controlled and natural effects; the paper does not propose remedies (e.g., triple-differences, differential trends models, negative-control time trends).
- Interference and spillovers
- SUTVA is assumed; no extensions address spillovers in treatment or mediator (e.g., peer effects, market-level shocks). Identification and estimation under partial interference or network spillovers in mediation DiD remain open.
- Measurement and classification error
- The framework is silent on mismeasurement/misclassification in D, M, M0, or Y (common in surveys). Bias characterization and corrections (e.g., validation subsamples, SIMEX, IV for M) are not discussed.
- Inference under clustering and serial correlation
- The asymptotics do not explicitly address clustered designs, serial correlation, or group-level shocks common in DiD. Guidance on cluster-robust variance, block bootstrap, or randomization inference in the DML-with-mediation setting is missing.
- Finite-sample performance under weak overlap and many cells
- Simulations consider “several thousand” observations but do not stress-test small samples, heavy regularization, rare treatment–mediator cells, or weak overlap. Empirical stabilization (e.g., weight truncation) and its bias–variance trade-offs are not analyzed.
- Efficiency and optimality
- Semiparametric efficiency bounds and the efficient influence function for the proposed mediation DiD parameters are not derived; it is unknown whether the DR/DML estimators are efficiency-optimal or how they compare to alternative semiparametric estimators.
- Heterogeneity and distributional effects
- The paper focuses on mean effects; estimation of heterogeneous direct/indirect effects across X, distributional/quantile mediation effects, and policy-relevant weighting schemes (e.g., overlap or transport weights) are not developed.
- Robust diagnostics and falsification tests
- There is no suite of diagnostics tailored to mediation DiD (e.g., joint pre-trend checks for outcome and mediator, placebo mediators/outcomes, leave-one-cell-out checks, balance/overlap diagnostics for (D,M,T) cells).
- Missing data and attrition (panel) and compositional change (repeated cross sections)
- Identification under item nonresponse, attrition, or changing sampling frames is not treated; weighting or selection models compatible with the mediation DiD structure are absent.
- Practical ML implementation details
- Guidance on algorithm choice, hyperparameter tuning, cross-fitting folds, and nuisance estimation for multi-cell propensities is minimal. Stable implementations for many (d,m,t) models, cross-validated density estimation (for continuous M/D), and software toolchains are not provided.
- Empirical design choices and robustness
- In the application, robustness to alternative mediator definitions, timing (lags/leads of M), or multiple mediators is not explored; a template for applied sensitivity to these design choices would aid practice.
- External validity and transportability
- The approach targets ATET-like parameters; methods to transport identified direct/indirect effects across populations, sites, or time (with covariate shift) are not developed.
- Noncompliance/principal stratification integration
- Although potential-mediator strata are discussed conceptually, the paper does not develop identification or estimation under explicit principal stratification or monotonicity-type restrictions as alternatives to strong cross-strata parallel trends.
Practical Applications
Immediate Applications
Below are concrete, near-term use cases that can be implemented with existing data and tooling by leveraging the paper’s DiD-with-mediation framework and double/debiased machine learning (DML) estimators (with cross-fitting and doubly robust, Neyman-orthogonal scores).
- Healthcare and public health
- Evaluate insurance or coverage expansions: Decompose the total effect of new coverage (e.g., Medicaid expansion, employer coverage) on health outcomes into direct effects and indirect effects via increased preventive care (e.g., checkups, screenings) or care access.
- Tools/workflows: R/Python implementations with DoubleML/EconML + DiD modules; nuisance estimation via lasso, random forests, or XGBoost; cross-fitting; outcome/propensity models by treatment–mediator–time cells; standard errors via influence functions.
- Assumptions/dependencies: Conditional parallel trends across treatment–mediator combinations; no anticipation effects; common support across four cells (treatment/mediator × pre/post); covariates not affected by treatment/mediator; for natural direct/indirect effects, either parallel trends across treatments or distributional parallel trends for the mediator, and pre-period mediator observed for the latter.
- Non-pharmaceutical interventions (NPIs) and infectious disease policy: Estimate how masking or mobility restrictions affect infections directly vs. indirectly through changes in mobility or contact rates.
- Tools/workflows: Merge policy timing, mobility, and case data; DiD mediation to separate direct policy impacts from behavior-mediated channels.
- Assumptions/dependencies: Stable parallel trends conditional on covariates like demographics and baseline mobility; SUTVA (limited spillovers between units) or design that mitigates interference.
- Education and ed-tech
- Policy or curriculum changes: Decompose the effect on test scores into a direct pedagogical effect vs. indirect effects through attendance, instructional time, or engagement metrics.
- Tools/workflows: Use administrative records (pre/post) with mediator (attendance/engagement) observed; apply DML-DiD to estimate controlled direct effects and natural indirect effects.
- Assumptions/dependencies: Mediator availability in pre- and post-periods (for distributional mediator parallel trends); common support across classes/schools; covariates exogenous to treatment.
- Labor, HR, and workforce development
- Training and placement programs: Decompose the effect on earnings into direct human-capital effects vs. mediated effects through job search intensity, networking, or credentialing.
- Tools/workflows: Administrative panels or repeated cross-sections; machine-learned nuisance functions; cross-fitting to manage high-dimensional covariates (skills tests, prior wages).
- Assumptions/dependencies: Parallel trends conditional on rich pre-treatment covariates; mediator not driven by unobservables that also confound treatment unless modeling distributional mediator trends with pre-period mediator data.
- Technology and product analytics (software platforms, e-commerce)
- Feature rollouts or UX changes: Separate the total effect on revenue/retention into direct effects and indirect effects via engagement (e.g., session length, click-through, notification reactions).
- Tools/workflows: Use A/B-like staggered rollouts with pre/post panels; estimate controlled direct effects (fix mediator) when product teams can hold engagement pathways constant (e.g., throttling notifications in a test cell).
- Assumptions/dependencies: Parallel trends across treatment–mediator cells for non-random rollouts; sufficient overlap in user features; careful handling of post-treatment variables as covariates (must be exogenous or avoided).
- Marketing and advertising
- Campaign evaluation: Quantify how media spend affects sales directly vs. indirectly via awareness or store visits (mediators measured via surveys or foot-traffic data).
- Tools/workflows: Combine campaign timing with repeated cross-sections (e.g., panels of zip codes or customers); DML estimators with orthogonal scores for robustness to high-dimensional controls.
- Assumptions/dependencies: Availability and stability of mediator measurement; common support across campaign and non-campaign regions/customers.
- Energy and environment
- Subsidies and standards: Decompose the effect of clean-energy subsidies or building codes on energy consumption into direct effects vs. mediated effects through technology adoption/retrofits.
- Tools/workflows: Utility billing panels, retrofit records as mediators; apply DiD mediation to attribute savings to adoption channels.
- Assumptions/dependencies: Pre-period mediator observed (for distributional mediator trends); parallel trends conditional on weather, socioeconomics.
- Finance and consumer banking
- Pricing or policy changes: Separate the effect of fee changes or nudges on balances or defaults into direct effects and indirect effects via product usage intensity or customer churn propensity (mediators).
- Tools/workflows: Bank administrative data; DML-DiD to manage large feature sets; cross-fitting.
- Assumptions/dependencies: Adequate overlap across fee-policy cohorts; mediator not driven by unmeasured, time-varying confounders that also affect treatment.
- Program evaluation and academia
- Mechanism-aware re-analyses of existing DiD studies: Revisit prior policy evaluations to quantify direct vs. mediated channels (e.g., labor policies via hours worked, health policies via care utilization).
- Tools/workflows: Replication code built on DoubleML/EconML; transparent reporting of controlled and natural direct/indirect effects with confidence intervals.
- Assumptions/dependencies: Access to mediators in pre/post data; diagnostics for common support and parallel trends within mediator-specific cells.
- Daily operations in SMEs and NGOs
- Operational changes (e.g., new service hours, delivery guarantees): Split effects on satisfaction or donations into direct effects and mediated effects via service reliability or response rates.
- Tools/workflows: Lightweight pipelines using scikit-learn or R’s mlr3 for nuisance estimation; K-fold cross-fitting.
- Assumptions/dependencies: Sufficient sample sizes (the paper’s simulations favor several thousand observations); mediator measured consistently across time.
Long-Term Applications
These ideas require further methodological development, scaling, or tooling beyond what is immediately available, though they build directly on the paper’s innovations.
- Standardized, open-source “DiD Mediation” packages
- Sector: Cross-sector (academia, industry, policy)
- Description: Dedicated R/Python libraries implementing all estimands in the paper (ATET, ATE, controlled direct effects, natural direct/indirect effects) for repeated cross-sections and panels, with built-in cross-fitting, orthogonal scores, standard errors, and assumption diagnostics.
- Dependencies/assumptions: Community validation on multiple designs (staggered adoption, multivalued/continuous mediators); guidance for cluster-robust inference; benchmarks for finite-sample behavior.
- Assumption-diagnostic and sensitivity-analysis suites
- Sector: Policy evaluation, academia
- Description: Tools for parallel-trend diagnostics within treatment–mediator cells, placebo/lead-lag tests, overlap checks, and sensitivity bounds for violations of exogeneity or “no anticipation” (for outcome and mediator).
- Dependencies/assumptions: Formal tests/visualizations generalized to mediator-specific groups; interpretable reports for non-technical stakeholders.
- Multi-period, staggered-adoption mediation DiD
- Sector: Public policy, tech product rollouts, energy standards
- Description: Generalize to many periods and staggered timing with mediators, harmonizing with modern DiD estimands while preserving orthogonality and DR properties.
- Dependencies/assumptions: Careful aggregation/weighting across cohorts; extensions of parallel trends to dynamic mediator paths; scalable computation.
- Real-time policy and experimentation dashboards
- Sector: Government, healthcare systems, large platforms
- Description: Streaming implementations to monitor direct and mediated effects as new data arrive (e.g., weekly), supporting agile policy adjustments and feature rollbacks.
- Dependencies/assumptions: Stable data pipelines; online cross-fitting; governance for inference under repeated looks and adaptive decision-making.
- Design of interventions targeting mediators
- Sector: Healthcare, education, labor, product growth
- Description: Use estimated natural indirect effects to prioritize policies that most effectively move mediators (e.g., preventive visits, attendance, engagement), and simulate impact under alternative mediator distributions.
- Dependencies/assumptions: Valid identification of mediator distribution under counterfactual treatment; capacity to effect mediator changes in practice (policy levers).
- Privacy-preserving and federated mediation DiD
- Sector: Healthcare consortia, finance, public agencies
- Description: Federated implementations of DML-DiD mediation with differential privacy, enabling cross-organization evaluation without sharing raw data.
- Dependencies/assumptions: Methods for secure aggregation of orthogonal scores; DP-aware variance estimation; additional research on privacy–utility trade-offs.
- Fairness- and compliance-aware mechanism analysis
- Sector: Finance, hiring, lending, insurance
- Description: Detect and constrain mediated pathways that raise fairness or regulatory concerns (e.g., discouraging reliance on sensitive mediators); report decomposed effects by protected groups.
- Dependencies/assumptions: Sufficient subgroup sample sizes; fairness constraints integrated with DML estimation; scrutiny of SUTVA and spillovers across groups.
- High-dimensional mediators and unstructured data
- Sector: Tech, marketing, public policy
- Description: Extend to vector- or function-valued mediators (e.g., text-derived sentiment, image features, clickstream embeddings), with regularization and orthogonalization strategies.
- Dependencies/assumptions: New identification results for high-dimensional mediator distributions; scalable nuisance learners; careful handling of post-treatment feature leakage.
- Optimal policy learning informed by mechanisms
- Sector: Public policy, digital platforms
- Description: Combine mechanism-aware DiD estimates with policy optimization (e.g., reinforcement learning) to select interventions that maximize direct and indirect gains while respecting constraints (costs, fairness).
- Dependencies/assumptions: Off-policy evaluation that respects DiD assumptions; stability of mediator–outcome relationships over policy changes.
- Power and design guidance for mediation DiD
- Sector: Academia, evaluation units
- Description: Simulation-based tools to plan sample sizes and allocation across treatment–mediator–time cells; guidance on measuring mediators in the pre-period to enable distributional mediator parallel trends.
- Dependencies/assumptions: Realistic data-generating processes calibrated to sectoral contexts; integration with institutional data collection.
- Robustness to interference and network spillovers
- Sector: Public health, education, platforms
- Description: Extend identification to allow bounded spillovers (violations of SUTVA), especially when mediators (e.g., behavior change) propagate through networks.
- Dependencies/assumptions: New theory for network-robust parallel trends; data on network structure; cluster- or exposure-mapping designs.
- Policy standards and reporting guidelines
- Sector: Government agencies, international organizations
- Description: Develop best-practice protocols for reporting direct/indirect effects in DiD studies (checklists for assumptions, mediator measurement, diagnostics, decomposition).
- Dependencies/assumptions: Consensus-building across statisticians, economists, and policy analysts; training and certification programs.
Notes on feasibility across applications:
- Data: Requires pre- and post-treatment outcomes; a mediator measured post-treatment and, for some natural-effect identification paths, also pre-treatment; rich covariates unaffected by treatment/mediator; adequate overlap across treatment–mediator–time cells.
- Assumptions: Conditional parallel trends (possibly across treatment–mediator combinations); no anticipation for outcomes (and mediators, if used); exogenous covariates; SUTVA or study designs limiting spillovers.
- Estimation: Large samples (the paper’s simulations favor several thousand observations) improve finite-sample performance; cross-fitting and orthogonal scores are critical for robustness when using ML for nuisance functions.
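The overlap requirement across treatment–mediator–time cells can be checked crudely before any estimation. The sketch below counts observations per (treatment, mediator, period) cell and flags sparse ones. The record layout, the function names, and the threshold of 50 are hypothetical choices for illustration, and raw cell counts are only a first check: common support is a condition on covariates, so well-filled cells do not by themselves guarantee overlap.

```python
from collections import Counter
from itertools import product


def cell_counts(records, d_levels=(0, 1), m_levels=(0, 1), t_levels=(0, 1)):
    """Count observations in every (treatment, mediator, period) cell, including empty ones."""
    counts = Counter((r["d"], r["m"], r["t"]) for r in records)
    return {cell: counts.get(cell, 0) for cell in product(d_levels, m_levels, t_levels)}


def sparse_cells(counts, min_size=50):
    """Return the cells with fewer than min_size observations (hypothetical threshold)."""
    return sorted(cell for cell, n in counts.items() if n < min_size)


# Hypothetical toy data: some cells are empty or nearly empty.
records = (
    [{"d": 0, "m": 0, "t": 0}] * 80
    + [{"d": 0, "m": 0, "t": 1}] * 75
    + [{"d": 1, "m": 1, "t": 0}] * 60
    + [{"d": 1, "m": 1, "t": 1}] * 55
    + [{"d": 1, "m": 0, "t": 1}] * 5
)

counts = cell_counts(records)
flagged = sparse_cells(counts)
for cell in flagged:
    print(f"sparse cell (d, m, t) = {cell}: {counts[cell]} observations")
```

With multivalued treatments or mediators, the same functions apply after passing the relevant levels; continuous variables would first need to be binned, which is itself a design choice.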
Glossary
- Approximate sparsity: A high-dimensional modeling property where the true function can be well-approximated by a small number of nonzero coefficients. "approximate sparsity when lasso regression is used for nuisance estimation"
- Asymptotic normality: The behavior of an estimator whose sampling distribution converges to a normal distribution as sample size grows. "we establish their asymptotic normality under standard regularity conditions"
- Average treatment effect (ATE): The average causal effect of a treatment across the entire population. "the average treatment effect (ATE) in the total population"
- Average treatment effect on the treated (ATET): The average causal effect of a treatment among those who actually received it. "This permits identifying the average treatment effect on the treated (ATET)."
- Compliance: In causal inference, a subject’s behavior with respect to following treatment, often defined via potential mediator or treatment states. "closely related to compliance in instrumental variable contexts"
- Common support: An identification requirement ensuring overlap in covariate distributions across comparison groups. "Assumption (Common support):"
- Conditional parallel trends (across treatment–mediator combinations): The assumption that, given covariates, the change in mean potential outcomes over time is equal across specific treatment–mediator groups. "Assumption (Conditional parallel trends across treatment-mediator combinations):"
- Controlled direct effect: The effect of the treatment on the outcome when holding the mediator fixed at a specified value. "represents the controlled direct effect when fixing the mediator at ."
- Cross-fitting: A sample-splitting technique to reduce overfitting by estimating nuisance functions and target scores on separate folds. "we further employ cross-fitting to ensure that nuisance parameters and score functions are not estimated on the same subsamples"
- Debiased machine learning (DML): A framework that uses orthogonal scores and machine learning to estimate causal parameters with reduced bias. "double/debiased machine learning (DML) framework"
- Difference-in-differences (DiD): A method that compares pre–post changes in outcomes between treated and control groups to identify causal effects. "Difference-in-differences (DiD) (Snow, 1855; Ashenfelter, 1978) is among the most popular methods for treatment evaluation"
- Distributional parallel trends (in the mediator): The assumption that, conditional on covariates, the change over time in the mediator’s distribution is equal across treatment groups. "Assumption (Conditional distributional parallel trends in the mediator):"
- Doubly robust (DR): An estimator property ensuring consistency if either the outcome model or the treatment/propensity model is correctly specified. "doubly robust (DR) score functions"
- Dynamic treatment effects: Causal effects associated with sequences or joint settings of treatment and mediator values over time. "consistent with the framework of dynamic treatment effects."
- Identification: The ability to uniquely recover a causal parameter from observed data under specified assumptions. "Identification relies on a conditional parallel trends assumption"
- Instrumental variable contexts: Settings where instruments are used to address endogeneity and identify causal effects. "instrumental variable contexts"
- Inverse probability weighting: A method that weights observations by the inverse of their propensity scores to correct for selection. "proposed by Abadie (2005) (inverse probability weighting)"
- Lasso regression: A regularized regression technique using an L1 penalty to promote sparsity in coefficient estimates. "lasso regression is used for nuisance estimation"
- Mediation analysis: The study of how a treatment affects an outcome through intermediate variables (mediators). "Difference-in-differences for mediation analysis using double machine learning"
- Mediator: An intermediate variable through which part of a treatment’s effect on the outcome is transmitted. "mediators"
- Monotonicity (of the mediator): An assumption that the mediator does not decrease when treatment is applied. "monotonicity of the mediator in treatment"
- Natural direct effect: The effect of treatment on the outcome when the mediator is set to the value it would take under a particular treatment condition. "natural direct effect"
- Natural indirect effect: The portion of the treatment effect that operates through changes induced in the mediator. "natural indirect effect"
- Neyman orthogonality: A property of score functions that makes estimators first-order insensitive to errors in nuisance estimates. "which satisfies Neyman orthogonality:"
- No anticipation (assumption): The assumption that pre-treatment outcomes or mediators are unaffected by future treatment or mediator assignments. "Assumption (No anticipation of effect on outcome):"
- Nuisance parameters: Auxiliary, non-target functions (e.g., outcome and propensity models) needed for estimation but not of direct interest. "insensitive—to estimation errors in the nuisance parameters"
- Panel data: Data where the same individuals are observed repeatedly over time. "both repeated cross sections and panel data"
- Parallel trends assumption: The requirement that treated and control groups would have had the same trends in outcomes absent treatment. "invoking a parallel trend assumption"
- Potential outcomes: Hypothetical outcomes a unit would exhibit under different treatment or mediator states. "the potential outcome framework"
- Principal stratification: A causal framework defining subgroups by joint potential values (e.g., of the mediator) to analyze effects within strata. "fits into the causal framework of principal stratification"
- Propensity score: The probability of receiving a treatment (and possibly mediator/time cell) given covariates, used for adjustment. "also known as propensity score"
- Repeated cross sections: Data consisting of different individuals observed at different times rather than the same individuals over time. "repeated cross sections"
- Score functions: Estimating equations used to construct estimators, often designed to be orthogonal to nuisance errors. "DR score functions"
- Selection bias: Bias arising from systematic differences (often due to unobservables) between treated and control groups. "This parallel trend assumption imposes restrictions on the selection bias arising from unobserved confounders"
- Selection-on-observables: Identification strategy assuming no unobserved confounding conditional on observed covariates. "selection-on-observables assumptions"
- Semiparametric: Methods or models that combine parametric components with nonparametric flexibility. "semiparametric DiD framework"
- Stable unit treatment value assumption (SUTVA): Assumption of no interference between units and consistency of potential outcomes with observed treatments. "stable unit treatment value assumption (SUTVA)"
- Staggered treatment adoption: Settings where different groups adopt treatment at different times. "including staggered treatment adoption across groups"
- Two-way fixed effects: Linear models with unit and time fixed effects, often used for DiD but prone to misspecification. "two-way fixed effects models"