`vignettes/intro_medshift.Rmd`

`intro_medshift.Rmd`

We are interested in assessing the population intervention direct effect and the population intervention indirect effect, based on the effect decomposition of the population intervention effect introduced in Díaz and Hejazi (2020).

To proceed, we’ll use as our running example a simple data set from an observational study of the relationship between BMI and kids behavior, distributed as part of the `mma`

R package on CRAN. First, let’s load the packages we’ll be using and set a seed; then, load this data set and take a quick look at it

# preliminaries library(data.table) library(dplyr) library(sl3) library(medshift) library(mma) set.seed(429153) # load and examine data data(weight_behavior) dim(weight_behavior)

`## [1] 691 15`

head(weight_behavior)

```
## bmi age sex race numpeople car gotosch snack tvhours cmpthours
## 1 18.20665 12.2 F OTHER 5 3 2 1 4 0
## 2 22.78401 12.8 M OTHER 4 3 2 1 4 2
## 3 19.60725 12.6 F OTHER 4 2 4 2 NA NA
## 4 25.56754 12.1 M OTHER 2 3 2 1 0 2
## 5 15.07408 12.3 M OTHER 4 1 2 1 2 1
## 6 22.98338 11.8 M OTHER 4 1 1 1 4 3
## cellhours sports exercises sweat overweigh
## 1 0 2 2 1 0
## 2 0 1 8 2 0
## 3 NA <NA> 4 2 0
## 4 0 2 9 1 1
## 5 3 1 12 1 0
## 6 2 1 1 1 0
```

The documentation for the data set describes it as a “database obtained from the Louisiana State University Health Sciences Center, New Orleans, by Dr. Richard Scribner. He explored the relationship between BMI and kids behavior through a survey at children, teachers and parents in Grenada in 2014. This data set includes 691 observations and 15 variables.”

Unfortunately, the data set contains a few observations with missing values. As these are unrelated to the object of our analysis, we’ll simply remove these for the time being. Note that in a real data analysis, we might consider strategies to fully make of the observed data, perhaps by imputing missing values. For now, we simply remove the incomplete observations, resulting in a data set with fewer observations but much the same structure as the original:

`## [1] 567 15`

```
## bmi age sex race numpeople car gotosch snack tvhours cmpthours
## 1 18.20665 12.2 F OTHER 5 3 2 1 4 0
## 2 22.78401 12.8 M OTHER 4 3 2 1 4 2
## 4 25.56754 12.1 M OTHER 2 3 2 1 0 2
## 5 15.07408 12.3 M OTHER 4 1 2 1 2 1
## 6 22.98338 11.8 M OTHER 4 1 1 1 4 3
## 8 19.15658 12.1 F OTHER 3 3 2 1 0 0
## cellhours sports exercises sweat overweigh
## 1 0 1 2 1 0
## 2 0 0 8 2 0
## 4 0 1 9 1 1
## 5 3 0 12 1 0
## 6 2 0 1 1 0
## 8 1 0 1 3 0
```

For the analysis of this observational data set, we focus on the effect of participating in a sports team (`sports`

) on the BMI of children (`bmi`

), taking several related covariates as mediators (`snack`

, `exercises`

, `overweigh`

) and all other collected covariates as potential confounders. Considering an NPSEM, we separate the observed variables from the data set into their corresponding nodes as follows

Y <- weight_behavior_complete$bmi A <- weight_behavior_complete$sports Z <- weight_behavior_complete %>% select(snack, exercises, overweigh) W <- weight_behavior_complete %>% select(age, sex, race, numpeople, car, gotosch, tvhours, cmpthours, cellhours, sweat)

Finally, in our analysis, we consider an incremental propensity score intervention (IPSI), as first proposed by Kennedy (2018), wherein the *odds of participating in a sports team* is modulated by some fixed amount (\(0 \leq \delta \leq \infty\)) for each individual. Such an intervention may be interpreted as the effect of a school program that motivates children to participate in sports teams. To exemplify our approach, we postulate a motivational intervention that *triples the odds* of participating in a sports team for each individual:

delta_shift_ipsi <- 3

To easily incorporate ensemble machine learning into the estimation procedure, we rely on the facilities provided in the `sl3`

R package (Coyle et al. 2020). For a complete guide on using the `sl3`

R package, consider consulting https://tlverse.org/sl3, or https://tlverse.org (and https://github.com/tlverse) for the `tlverse`

ecosystem, of which `sl3`

is a major part. We construct an ensemble learner using a handful of popular machine learning algorithms below

# SL learners used for continuous data (the nuisance parameter M) xgb_contin_lrnr <- Lrnr_xgboost$new(nrounds = 50, objective = "reg:linear") enet_contin_lrnr <- Lrnr_glmnet$new(alpha = 0.5, family = "gaussian", nfolds = 3) lasso_contin_lrnr <- Lrnr_glmnet$new(alpha = 1, family = "gaussian", nfolds = 3) fglm_contin_lrnr <- Lrnr_glm_fast$new(family = gaussian()) contin_lrnr_lib <- Stack$new(enet_contin_lrnr, lasso_contin_lrnr, fglm_contin_lrnr, xgb_contin_lrnr) sl_contin_lrnr <- Lrnr_sl$new(learners = contin_lrnr_lib, metalearner = Lrnr_nnls$new()) # SL learners used for binary data (nuisance parameters G and E in this case) xgb_binary_lrnr <- Lrnr_xgboost$new(nrounds = 50, objective = "reg:logistic") enet_binary_lrnr <- Lrnr_glmnet$new(alpha = 0.5, family = "binomial", nfolds = 3) lasso_binary_lrnr <- Lrnr_glmnet$new(alpha = 1, family = "binomial", nfolds = 3) fglm_binary_lrnr <- Lrnr_glm_fast$new(family = binomial()) binary_lrnr_lib <- Stack$new(enet_binary_lrnr, lasso_binary_lrnr, fglm_binary_lrnr, xgb_binary_lrnr) logistic_metalearner <- make_learner(Lrnr_solnp, metalearner_logistic_binomial, loss_loglik_binomial) sl_binary_lrnr <- Lrnr_sl$new(learners = binary_lrnr_lib, metalearner = logistic_metalearner)

We may decompose the population intervention effect (PIE) in terms of a *population intervention direct effect* (PIDE) and a *population intervention indirect effect* (PIIE): \[\begin{equation*}
\overbrace{\mathbb{E}\{Y(A_\delta, Z(A_\delta)) -
Y(A_\delta, Z)\}}^{\text{PIIE}} +
\overbrace{\mathbb{E}\{Y(A_\delta, Z) - Y(A, Z)\}}^{\text{PIDE}}.
\end{equation*}\]

This decomposition of the PIE as the sum of the population intervention direct and indirect effects has an interpretation analogous to the corresponding standard decomposition of the average treatment effect. In the sequel, we will compute each of the components of the direct and indirect effects above using appropriate estimators as follows

- For \(\mathbb{E}\{Y(A, Z)\}\), the sample mean \(\frac{1}{n}\sum_{i=1}^n Y_i\) is sufficient;
- for \(\mathbb{E}\{Y(A_{\delta}, Z)\}\), an efficient one-step estimator for the effect of a joint intervention altering the exposure mechanism but not the mediation mechanism, as proposed in Díaz and Hejazi (2020); and,
- for \(\mathbb{E}\{Y(A_{\delta}, Z_{A_{\delta}})\}\), an efficient one-step estimator for the effect of a joint intervention altering both the exposure and mediation mechanisms, as proposed in Kennedy (2018) and implemented in the
`npcausal`

R package.

As given in Díaz and Hejazi (2020), the statistical functional identifying the decomposition term that appears in both the PIDE and PIIE \(\mathbb{E}\{Y(A_{\delta}, Z)\}\), which corresponds to altering the exposure mechanism while keeping the mediation mechanism fixed, is \[\begin{equation*}
\theta_0(\delta) = \int m_0(a, z, w) g_{0,\delta}(a \mid w) p_0(z, w)
d\nu(a, z, w),
\end{equation*}\] for which a one-step estimator is available. The corresponding *efficient influence function* (EIF) with respect to the nonparametric model \(\mathcal{M}\) is \(D_{\eta,\delta}(o) = D^Y_{\eta,\delta}(o) + D^A_{\eta,\delta}(o) + D^{Z,W}_{\eta,\delta}(o) - \theta(\delta)\). The one-step estimator may be computed using the EIF estimating equation, making use of cross-fitting (Zheng and van der Laan 2011; Chernozhukov et al. 2018) to circumvent any need for entropy conditions (i.e., Donsker class restrictions). The resultant estimator is \[\begin{equation*}
\hat{\theta}(\delta) = \frac{1}{n} \sum_{i = 1}^n D_{\hat{\eta}_{j(i)},
\delta}(O_i) = \frac{1}{n} \sum_{i = 1}^n \left\{ D^Y_{\hat{\eta}_{j(i)},
\delta}(O_i) + D^A_{\hat{\eta}_{j(i)}, \delta}(O_i) +
D^{Z,W}_{\hat{\eta}_{j(i)}, \delta}(O_i) \right\},
\end{equation*}\] which is implemented in the `medshift`

R package. We make use of that implementation to estimate \(\mathbb{E}\{Y(A_{\delta}, Z)\}\) via its one-step estimator \(\hat{\theta}(\delta)\) below

# let's compute the parameter where A (but not Z) are shifted theta_eff <- medshift(W = W, A = A, Z = Z, Y = Y, delta = delta_shift_ipsi, g_learners = sl_binary_lrnr, e_learners = sl_binary_lrnr, m_learners = sl_contin_lrnr, phi_learners = Lrnr_hal9001$new(), estimator = "onestep", estimator_args = list(cv_folds = 3)) summary(theta_eff)

```
## lwr_ci param_est upr_ci param_var eif_mean
## 18.765503 19.119824 19.474145 0.032681 -2.570542e-16
## estimator
## onestep
```

Recall that, based on the decomposition outlined previously, the population intervention direct effect may be denoted \(\beta_{\text{PIDE}}(\delta) = \theta_0(\delta) - \mathbb{E}Y\). Thus, an estimator of the PIDE, \(\hat{\beta}_{\text{PIDE}}(\delta)\) may be expressed as a composition of estimators of its constituent parameters: \[\begin{equation*} \hat{\beta}_{\text{PIDE}}({\delta}) = \hat{\theta}(\delta) - \frac{1}{n} \sum_{i = 1}^n Y_i. \end{equation*}\]

Based on the above, we may construct an estimator of the PIDE using quantities already computed. The convenience function below applies the simple delta method required in the case of a linear contrast between the two constituent parameters:

# convenience function to compute inference via delta method: EY1 - EY0 linear_contrast <- function(params, eifs, ci_level = 0.95) { # bounds for confidence interval ci_norm_bounds <- c(-1, 1) * abs(stats::qnorm(p = (1 - ci_level) / 2)) param_est <- params[[1]] - params[[2]] eif <- eifs[[1]] - eifs[[2]] se_eif <- sqrt(var(eif) / length(eif)) param_ci <- param_est + ci_norm_bounds * se_eif # parameter and inference out <- c(param_ci[1], param_est, param_ci[2]) names(out) <- c("lwr_ci", "param_est", "upr_ci") return(out) }

With the above convenience function in hand, we’ll construct or extract the necessary components from existing objects and simply apply the function:

# parameter estimates and EIFs for components of direct effect EY <- mean(Y) eif_EY <- Y - EY params_de <- list(theta_eff$theta, EY) eifs_de <- list(theta_eff$eif, eif_EY) # direct effect = EY - estimated quantity de_est <- linear_contrast(params_de, eifs_de) de_est

```
## lwr_ci param_est upr_ci
## -0.496085065 -0.007284226 0.481516613
```

As given above, we have for our estimate of the direct effect \(\hat{\beta}_{\text{PIDE}}({\delta}) =\) -0.007.

Chernozhukov, Victor, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. 2018. “Double/Debiased Machine Learning for Treatment and Structural Parameters.” *The Econometrics Journal* 21 (1). https://doi.org/10.1111/ectj.12097.

Coyle, Jeremy R, Nima S Hejazi, Ivana Malenica, and Oleg Sofrygin. 2020. *sl3: Modern Pipelines for Machine Learning and Super Learning*. https://github.com/tlverse/sl3. https://doi.org/10.5281/zenodo.1342293.

Díaz, Iván, and Nima S Hejazi. 2020. “Causal Mediation Analysis for Stochastic Interventions.” *Journal of the Royal Statistical Society: Series B (Statistical Methodology)*. Wiley Online Library. https://doi.org/10.1111/rssb.12362.

Kennedy, Edward H. 2018. “Nonparametric Causal Effects Based on Incremental Propensity Score Interventions.” *Journal of the American Statistical Association*. Taylor & Francis.

Zheng, Wenjing, and Mark J van der Laan. 2011. “Cross-Validated Targeted Minimum-Loss-Based Estimation.” In *Targeted Learning*, 459–74. Springer.