Coppock, Alexander, Alan S. Gerber, Donald P. Green, and Holger L. Kern. 2017. “Combining Double Sampling and Bounds to Address Non-ignorable Missing Outcomes in Randomized Experiments.” Political Analysis 25(2): 188-206.

- Journal Site
- Online Appendix
- Replication Archive
- R package “attrition”
- Alan S. Gerber’s personal website
- Donald P. Green’s personal website
- Holger L. Kern’s personal website

Missing outcome data plague many randomized experiments. Common solutions rely on ignorability assumptions that may not be credible in all applications. We propose a method for confronting missing outcome data that makes fairly weak assumptions but can still yield informative bounds on the average treatment effect. Our approach is based on a combination of the double sampling design and non-parametric worst-case bounds. We derive a worst-case bounds estimator under double sampling and provide analytic expressions for variance estimators and confidence intervals. We also propose a method for covariate adjustment using post-stratification and a sensitivity analysis for non-ignorable missingness. Finally, we illustrate the utility of our approach using Monte Carlo simulations and a placebo-controlled randomized field experiment on the effects of persuasion on social attitudes with survey-based outcome measures.
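
As a rough illustration of how double sampling and worst-case bounds fit together, consider the following sketch. The notation is assumed here for illustration and may differ from the paper's. Let the outcome be bounded in $[y_L, y_U]$, let $\pi_1(z)$ be the first-round response rate in treatment arm $z$, and let $\pi_2(z)$ be the response rate among first-round non-respondents who are randomly selected for follow-up. Write $\mu_1(z)$ for the mean outcome among first-round respondents, $\mu_2(z)$ for the mean among follow-up respondents, and $\mu_m(z)$ for the unobserved mean among subjects who would respond in neither round. By the law of total expectation,

$$
E[Y(z)] = \pi_1(z)\,\mu_1(z) + \bigl(1 - \pi_1(z)\bigr)\Bigl[\pi_2(z)\,\mu_2(z) + \bigl(1 - \pi_2(z)\bigr)\,\mu_m(z)\Bigr].
$$

Only $\mu_m(z)$ is unidentified. Replacing it with $y_L$ or $y_U$ gives lower and upper bounds $E^{L}[Y(z)]$ and $E^{U}[Y(z)]$, so that

$$
E^{L}[Y(1)] - E^{U}[Y(0)] \;\le\; \text{ATE} \;\le\; E^{U}[Y(1)] - E^{L}[Y(0)],
$$

and the width of this identification region is

$$
(y_U - y_L)\sum_{z \in \{0,1\}} \bigl(1 - \pi_1(z)\bigr)\bigl(1 - \pi_2(z)\bigr).
$$

Under these assumptions, the region narrows as the second-round response rate rises, because follow-up shrinks the share of subjects whose outcomes are never observed.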

[Figure 1 from the paper: width of confidence intervals and identification regions as a function of the second-round response rate.]

Example usage of the R package on simulated data:

```
# If not installed:
# install.packages("devtools")
# devtools::install_github("acoppock/attrition")
library(attrition)
set.seed(343) # For reproducibility
N <- 1000
# Potential Outcomes
Y_0 <- sample(1:5, N, replace=TRUE, prob = c(0.1, 0.3, 0.3, 0.2, 0.1))
Y_1 <- sample(1:5, N, replace=TRUE, prob = c(0.1, 0.1, 0.4, 0.3, 0.1))
R1_0 <- rbinom(N, 1, prob = 0.7)
R1_1 <- rbinom(N, 1, prob = 0.8)
R2_0 <- rbinom(N, 1, prob = 0.9)
R2_1 <- rbinom(N, 1, prob = 0.95)
# Covariate
strata <- as.numeric(Y_0 > 2)
# Random Assignment
Z <- rbinom(N, 1, .5)
# Reveal Initial Sample Outcomes
R1 <- Z*R1_1 + (1-Z)*R1_0 # Initial sample response
Y_star <- Z*Y_1 + (1-Z)*Y_0 # True outcomes
Y <- Y_star
Y[R1==0] <- NA # Mask outcome of non-responders
# Conduct Double Sampling
Attempt <- rep(0, N)
Attempt[is.na(Y)] <- rbinom(sum(is.na(Y)), 1, .5)
R2 <- rep(0, N)
R2[Attempt==1] <- (Z*R2_1 + (1-Z)*R2_0)[Attempt==1]
Y[R2==1 & Attempt==1] <- Y_star[R2==1 & Attempt==1]
df <- data.frame(Y, Z, R1, Attempt, R2, strata)
# Without post-stratification
estimator_ds(Y, Z, R1, Attempt, R2, minY=1, maxY=5, data=df)
```

```
##    ci_lower    ci_upper     low_est     upp_est     low_var     upp_var
## 0.048650401 0.437209790 0.181568350 0.304099481 0.006471977 0.006490723
```

```
# With post-stratification
estimator_ds(Y, Z, R1, Attempt, R2, minY=1, maxY=5, strata=strata, data=df)
```

```
##    ci_lower    ci_upper     low_est     upp_est     low_var     upp_var
## 0.134213533 0.482968974 0.246987534 0.375801359 0.004689110 0.004234475
```
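
For intuition about what lower and upper bound estimates of this kind represent, here is a minimal base-R sketch of worst-case bounds under double sampling, computed by hand on a tiny toy data set. The helper `ds_bounds_sketch` is hypothetical and not part of the `attrition` package. It ignores variance estimation and post-stratification, and it is not claimed to reproduce the output of `estimator_ds` exactly.

```r
# Hypothetical helper (NOT part of the attrition package): point estimates of
# worst-case bounds on the ATE under double sampling, using base R only.
ds_bounds_sketch <- function(Y, Z, R1, Attempt, R2, minY, maxY) {
  arm_bounds <- function(z) {
    in_arm   <- Z == z
    p1       <- mean(R1[in_arm])                  # first-round response rate
    followed <- in_arm & R1 == 0 & Attempt == 1   # non-respondents sampled for follow-up
    p2       <- mean(R2[followed])                # second-round response rate
    m1       <- mean(Y[in_arm & R1 == 1])         # mean among first-round respondents
    m2       <- mean(Y[followed & R2 == 1])       # mean among follow-up respondents
    observed <- p1 * m1 + (1 - p1) * p2 * m2      # identified part of E[Y(z)]
    missing_share <- (1 - p1) * (1 - p2)          # share never observed
    c(lower = observed + missing_share * minY,    # fill missing outcomes with minY
      upper = observed + missing_share * maxY)    # fill missing outcomes with maxY
  }
  b1 <- arm_bounds(1)
  b0 <- arm_bounds(0)
  c(low_est = unname(b1["lower"] - b0["upper"]),
    upp_est = unname(b1["upper"] - b0["lower"]))
}

# Toy data: four subjects per arm, outcomes on a 1-5 scale.
Z       <- c(1, 1, 1, 1,  0, 0, 0, 0)
R1      <- c(1, 1, 0, 0,  1, 1, 0, 0)
Attempt <- c(0, 0, 1, 1,  0, 0, 1, 1)
R2      <- c(0, 0, 1, 0,  0, 0, 1, 0)
Y       <- c(4, 2, 5, NA, 2, 2, 1, NA)

est <- ds_bounds_sketch(Y, Z, R1, Attempt, R2, minY = 1, maxY = 5)
print(est)  # low_est = 0.5, upp_est = 2.5 for this toy data
```

In this toy example one quarter of each arm is never observed, so the interval has width (5 - 1) * (0.25 + 0.25) = 2. The sketch assumes every arm has at least one follow-up attempt and at least one respondent in each round; otherwise the means are undefined.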