Workshop Descriptions

Day 1 (May 22, 2019)


Half-Day Workshops

Workshop 1: Propensity Score Analysis in the Context of Complex Survey Data [9am-12pm]
Ehsan Karim, University of British Columbia

Propensity score analyses (PSA) are widely used in analyzing observational datasets to reduce the impact of confounding due to observed covariates. Nationally representative, population-based complex survey datasets are frequently used in applied studies, most of which ignore the complex survey design features. This is partly because there is a paucity of clear guidelines on how PSA should be implemented in a complex survey data analysis context. Only a few relatively recent studies have examined how to incorporate PSA in this context, and some of their recommendations are contradictory, inconclusive, or not generalizable to all types of PSA. This workshop will help participants recognize some of the challenges and open questions in this area. It focuses on demonstrating the implementation of PSA in a complex survey data analysis context through an illustrative data analysis exercise. The prerequisites are (i) knowledge of multiple regression analysis and (ii) a working knowledge of R (the software code provided will be annotated). Some background in survey data features is useful but not required.


Workshop 2: An Introduction to Instrumental Variables Methods [1pm-4pm]
Luke Keele, University of Pennsylvania
Hyunseung Kang, University of Wisconsin-Madison

Instrumental variables (IVs) are a popular tool in epidemiology, economics, and the social sciences for estimating causal effects in the presence of unobserved confounding. The recent growth in the availability of large-scale observational data offers new opportunities for IV designs, such as in Mendelian randomization with genome-wide association studies or in electronic health records (EHR) with patient noncompliance. The IV framework requires a variable that (1) is independent of unmeasured confounders, (2) affects the treatment, and (3) affects the outcome only indirectly through its effect on the treatment. For IV designs to be useful, however, these key assumptions must be evaluated carefully, and a wide variety of diagnostic tests is needed to assess the causal conclusions from an IV analysis.

The course will begin with the application of IVs to handle non-compliance in RCTs, a classic application of IVs. This portion of the course will introduce concepts related to compliance classes, the Wald estimator, two-stage least squares (2SLS), and local average treatment effects (LATE). The second portion of the course will focus on IVs based on natural experiments. Participants will learn how to carefully evaluate the validity of results from IV designs. The course will include numerous real-world applications of IVs to illustrate concepts and help course participants understand how to evaluate IV assumptions. Participants will receive practical exposure to IV methods, relevant R code, and a didactic introduction to some recent developments with direct application to large-scale observational data.
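As a rough illustration of the non-compliance setting described above, the following is a minimal sketch of the Wald estimator on simulated data. It is not taken from the workshop materials (which use R); the function names, the simulated data-generating process, and the true effect of 2.0 are all assumptions chosen for illustration. The simulation assumes one-sided non-compliance and a constant treatment effect.

```python
import numpy as np


def wald_estimator(z, d, y):
    """Wald estimator: effect of assignment on the outcome divided by
    the effect of assignment on treatment uptake."""
    z = np.asarray(z)
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)
    itt_outcome = y[z == 1].mean() - y[z == 0].mean()
    itt_uptake = d[z == 1].mean() - d[z == 0].mean()
    return itt_outcome / itt_uptake


def simulate_trial(n=200_000, seed=0):
    """Hypothetical randomized trial with one-sided non-compliance.

    u is an unmeasured confounder that drives both compliance and the
    outcome; the true treatment effect is 2.0 for every unit.
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)                 # randomized assignment (the instrument)
    u = rng.normal(size=n)                         # unmeasured confounder
    complier = (u + rng.normal(size=n)) > -0.3     # compliance depends on u
    d = (z * complier).astype(float)               # treatment received; controls never treated
    y = 2.0 * d + 1.5 * u + rng.normal(size=n)     # outcome
    return z, d, y


z, d, y = simulate_trial()

# Naive "as-treated" contrast, confounded by u.
as_treated = y[d == 1].mean() - y[d == 0].mean()

# Wald estimate, and the equivalent covariance ratio that 2SLS reduces to
# with a single binary instrument and no covariates.
wald = wald_estimator(z, d, y)
iv_ratio = np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]

print(f"as-treated: {as_treated:.2f}, Wald: {wald:.2f}, cov ratio: {iv_ratio:.2f}")
```

In this sketch the as-treated comparison overstates the effect because units with larger u are more likely to take the treatment, while the Wald ratio stays close to the simulated effect. Under heterogeneous effects the same ratio would instead target the local average treatment effect among compliers.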


Full-Day Workshops

Workshop 3: Bayesian Causal Inference for Experimental and Observational Studies [9am-4pm]
Fabrizia Mealli, University of Florence
Fan Li, Duke University
Peng Ding, University of California, Berkeley
Laura Forastiere, Yale University
Georgia Papadogeorgou, Duke University

The workshop will review the Bayesian approach to causal inference under the potential outcome (PO) framework, which defines a causal effect as the comparison of the potential outcomes under different treatment conditions for the same units. The Bayesian paradigm is natural for causal inference, which is inherently a missing data problem under the PO approach. Recent advances in Bayesian methods for causal effects will be discussed, organized by the classification of assignment mechanisms: randomized, ignorable, latently ignorable, locally ignorable, and sequentially ignorable. For each assignment mechanism, estimands, assumptions, the general structure of Bayesian inference, and specific scientific applications will be described. There will also be hands-on work with R and RStan, covering how to specify models for the potential outcomes and derive posterior distributions of causal estimands using real and simulated data.

This workshop should serve as a tutorial for researchers in causal inference who are not familiar with the Bayesian approach.


Workshop 4: The tlverse software ecosystem for causal inference [9am-4pm]
Mark van der Laan, Alan Hubbard, Nima Hejazi, Ivana Malenica, and Rachael Phillips,
University of California, Berkeley

This full-day workshop will provide a comprehensive introduction to the field of targeted learning for causal inference and the corresponding tlverse software ecosystem. In particular, we will focus on targeted minimum loss-based estimators of causal effects, including those of static, dynamic, optimal dynamic, and stochastic interventions. These multiply robust, efficient plug-in estimators use state-of-the-art ensemble machine learning tools to flexibly adjust for confounding while yielding valid statistical inference. We will discuss the utility of this robust estimation strategy in comparison to conventional techniques, which often rely on restrictive statistical models and may therefore lead to severely biased inference. In addition to discussion, this workshop will incorporate both interactive activities and hands-on, guided R programming exercises, giving participants the opportunity to familiarize themselves with methodology and tools that will translate to real-world causal inference analyses. Participants are strongly encouraged to have an understanding of basic statistical concepts such as confounding, probability distributions, confidence intervals, hypothesis tests, and regression. Advanced knowledge of mathematical statistics may be useful but is not necessary. Familiarity with the R programming language will be essential.


Workshop 5: Graphical Model Identification Theory For Causal Inference and Missing Data Problems [9am-4pm]
Ilya Shpitser, Johns Hopkins University
Jamie Robins, Harvard University

This short course will present a graphical modeling framework for describing counterfactual targets of inference that arise in causal inference and missing data problems, specifically average causal effects, direct and indirect effects, and other fixed functions of the full data distribution. We will show how to use this framework to derive standard identifying functionals for these targets under monotone missing-at-random models, ignorable and sequentially ignorable models in causal inference, and standard mediation analysis models. In addition, we will show that the framework is sufficiently general to obtain identification in the presence of hidden causes of exposures, or of missingness not at random, with Pearl’s front-door model and Robins’ permutation model as examples.
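For readers unfamiliar with the front-door model mentioned above, its standard identifying functional can be written as follows. The notation is an assumption made here for illustration and is not taken from the course materials: A is the exposure, M a mediator that intercepts every directed path from A to the outcome Y, and A and Y share a hidden common cause.

```latex
% Front-door functional: distribution of the outcome under an
% intervention setting A = a, with the A--Y confounder unobserved.
p\bigl(Y(a) = y\bigr)
  = \sum_{m} p(m \mid a) \sum_{a'} p(y \mid m, a')\, p(a')
```

The inner sum adjusts the effect of M on Y for A, and the outer sum averages over the effect of A on M, so no term requires observing the hidden confounder.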

Finally, we will present a class of functionals given by a truncated nested factorization that generalize the g-formula functionals to models with arbitrary hidden variables. We will then present complete algorithms based on this factorization for distributions representing causal effects, path-specific effects, and responses to dynamic treatment regimes. These algorithms can be viewed as one-line reformulations of the ID algorithm. We will also describe a general identification algorithm for graphical missing data models.
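As background for the g-formula that this factorization generalizes, here is the fully observed case for two treatment times. The notation is assumed for illustration and is not drawn from the course: L_0 and L_1 are covariates measured before treatments A_0 and A_1, Y is the final outcome, and treatment assignment is sequentially ignorable given the observed past.

```latex
% g-formula for a two-period static regime (a_0, a_1),
% assuming all relevant variables are observed.
p\bigl(Y(a_0, a_1) = y\bigr)
  = \sum_{l_0,\, l_1}
    p(y \mid a_0, a_1, l_0, l_1)\,
    p(l_1 \mid a_0, l_0)\,
    p(l_0)
```

The functional is the observed-data factorization with the treatment factors p(a_0 | l_0) and p(a_1 | a_0, l_0, l_1) removed and the treatments fixed at the values of interest.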