Causal Effect Estimation with Observational ... - Iowa SAS® User

Causal Effect Estimation with Observational Data: Methods and Applications
Part I
Michael Lamm and Yiu-Fai Yung, SAS Institute
2018 Iowa and Nebraska SAS Users Groups
Copyright SAS Institute Inc. All rights reserved.

The central issue is how to estimate causal effects from observational data.

Causal analysis can address some practical research questions
  Medicine: How does smoking affect blood pressure?
  Social policy: Can a particular youth program reduce the juvenile crime rate?
  Behavioral science: Does music training enhance academic performance?

The generic causal question: Does T (binary treatment) cause Y (outcome)?

Outline
  Part I
    Issues of causal inference from observational data
    Introducing the propensity score
    Theories and assumptions
    Matching methods
  Part II
    Weighting methods
    Doubly robust methods
    Limitations
    Summary and conclusions

Software for estimating causal treatment effects
  Two procedures in SAS/STAT 14.2:
    PROC PSMATCH: creates appropriate data sets that behave like data you would have collected from randomized experiments
    PROC CAUSALTRT: estimates causal treatment effects by weighting, regression adjustment, and doubly robust methods
  In SAS/STAT 14.3:
    PROC CAUSALMED: estimates causal mediation effects

Causal Analysis in Experimental and Observational Studies

An Experimental Study
  [Diagram nodes: Music (Music Training), GPA (Academic Performance)]
  Does music training enhance academic performance?
  Subjects are randomly assigned to the treatment and control conditions
  Observed association between T (Music) and Y (GPA) = causal effect

Establishing the Causal Effect from an Observational Study
  [Diagram nodes: Sports, Music (Music Training), GPA (Academic Performance)]
  Observational studies: Subjects select the treatment
  Observed association between T (Music) and Y (GPA) = causal effect + confounding associations

Confounding Variables

  Pretreatment characteristics that are associated with both the treatment (T) and the outcome (Y) variables
  Usually represented generically as common causes of T and Y
  A confounding pretreatment characteristic can take two roles:
    Explain parts of the treatment-outcome association
    Affect the propensity of receiving treatment

Dealing with confounding pretreatment characteristics
  Matching on confounding pretreatment characteristics:
    Covariate matching methods
    Propensity score matching and stratification methods
  Adjusting for the confounding pretreatment characteristics or propensity of receiving treatment:
    Weighting methods
    Regression adjustment methods
    Doubly robust methods

Introducing the Propensity Score Methods

What happens when you have many confounding pretreatment characteristics?
  [Diagram nodes: X1, X2, ..., Xn, T (Treatment), Y (Outcome)]

Using the propensity score can reduce the complexity of the matching problem
  [Diagram nodes: X1, X2, ..., Xn, T (Treatment), Y (Outcome)]
  A propensity score is the probability of receiving treatment given $X = (X_1, \ldots, X_n)$: $e(X) = P(T = 1 \mid X)$

Propensity score matching methods are

  Easier to apply than trying to match all pretreatment characteristics
  Theoretical foundation: uses the potential outcomes framework to clearly define causal effects and the conditions necessary for their unbiased estimation

Example 1. Using the Optimal Matching Method of the PSMATCH Procedure

Simulated School data (60 with music training, 140 without)

  Obs  Student ID  Music  Sports  Absence   Gpa
    1          18  No     Yes        3.71  3.14
    2          61  No     No         2.08  3.32
    3          95  No     Yes        2.54  3.31
    4          41  No     No         3.01  3.14
    5          19  Yes    Yes        0.08  4.35
    6          51  No     Yes        1.20  3.88
    7         110  No     No         2.21  3.21
    8          87  No     Yes        2.30  3.28
  ...

  Music is the treatment variable
  Gpa is the outcome variable
  Sports and Absence are pretreatment characteristics

SAS Code

    proc psmatch data=School region=cs;   /* region=cs: restrict matching to the common support region */
       class Music Sports;
       /* propensity score model: probability of Music='Yes' given Sports and Absence */
       psmodel Music(Treated='Yes') = Sports Absence;
       /* optimal 1-1 matching on the logit of the propensity score, exact on Sports */
       match method=optimal(k=1) exact=Sports stat=lps caliper=0.25;
       assess ps var=(Sports Absence) / plots=all weight=none;
       /* keep only matched observations and record the matched-set identifier */
       output out(obs=match)=OutEx1 matchid=_MatchID;
    run;

  The requested optimal 1-1 method (method=optimal(k=1)) matches a distinct control unit (Music='No') to each treated unit (Music='Yes')
  A matched sample is saved in the output data set OutEx1

The output data set contains 120 matched observations (60 with music training, 60 without)

  Obs  Student ID  Music  Sports  Absence   Gpa     _PS_  _MATCHWGT_  _MatchID
    1          33  Yes    Yes        3.50  3.86  0.09283           1         1
    2          82  No     Yes        3.50  3.21  0.09296           1         1
    3          67  Yes    Yes        2.71  3.81  0.13790           1         2
    4          95  No     Yes        2.54  3.31  0.15009           1         2
    5          47  No     Yes        2.50  3.39  0.15300           1         3
    6           4  Yes    Yes        2.49  3.78  0.15333           1         3
    7          37  No     No         2.94  3.90  0.15549           1         4
    8         152  Yes    No         2.88  3.74  0.15988           1         4
  ...

Music training does not have a causal effect on academic performance (Gpa)

PS-matched sample: Nonsignificant causal effect

    proc ttest data=OutEx1;
       class Music;
       var Gpa;
    run;

  Variable: Gpa
  Music       Method            Mean      95% CL Mean
  No                          3.8517   3.7829   3.9206
  Yes                         3.8667   3.8002   3.9331
  Diff (1-2)  Pooled         -0.0149  -0.1096   0.0798
  Diff (1-2)  Satterthwaite  -0.0149  -0.1096   0.0798

  The 95% confidence intervals for the difference cover 0

Original sample: Significant but biased effect

    proc ttest data=School;
       class Music;
       var Gpa;
    run;

  Variable: Gpa
  Music       Method            Mean      95% CL Mean
  No                          3.6959   3.6381   3.7537
  Yes                         3.8667   3.8002   3.9331
  Diff (1-2)  Pooled         -0.1708  -0.2688  -0.0728
  Diff (1-2)  Satterthwaite  -0.1708  -0.2582  -0.0833

  The 95% confidence intervals for the difference do not cover 0

Theories and Assumptions

Potential outcomes framework
  Neyman (1923), Rubin (1974)
  Imagine that each subject can participate in both treatment and control conditions:
    $Y(1)$: potential outcome in the treatment condition
    $Y(0)$: potential outcome in the control condition

  Individual level causal effect: $Y(1) - Y(0)$
  Average treatment effect (ATE): $E[Y(1) - Y(0)]$
  Average treatment effect for the treated (ATT): $E[Y(1) - Y(0) \mid T = 1]$

The stable unit treatment value assumption (SUTVA) ensures that causal effects are well-defined (Rubin 1983)
  No hidden levels of treatment
  No interference among subjects

The consistency assumption relates the observed outcomes to the potential outcomes
  $Y = T\,Y(1) + (1 - T)\,Y(0)$
  You can observe at most one of the potential outcomes
  Missing data problem: What can we know about the missing potential outcomes?

Definitions and assumptions apply to both randomized experiments and observational studies
  Same causal effect definitions by the difference in potential outcome means
  Same SUTVA for defining the causal effects
  Same consistency assumption about observing potential outcomes
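
The consistency assumption and the missing data problem can be illustrated with a small simulation. The sketch below is not from the slides: the data set name Potential, the variables y0, y1, t, and y, and all numeric values are hypothetical. It generates both potential outcomes for every subject, assigns treatment at random, and forms the observed outcome as t*y1 + (1-t)*y0, so each subject reveals only one of its two potential outcomes.

```sas
/* Hypothetical simulation: names and values are illustrative assumptions */
data Potential;
   call streaminit(2018);              /* fix the random seed for reproducibility */
   do id = 1 to 1000;
      y0 = rand('normal', 3.5, 0.3);   /* potential outcome in the control condition   */
      y1 = y0 + 0.2;                   /* potential outcome in the treatment condition */
      t  = rand('bernoulli', 0.5);     /* randomized treatment assignment              */
      y  = t*y1 + (1 - t)*y0;          /* consistency: observed outcome                */
      output;
   end;
run;

/* Means of both potential outcomes, available only because this is a simulation */
proc means data=Potential mean;
   var y1 y0;
run;

/* What an analyst actually sees: observed outcome means by treatment group */
proc means data=Potential mean;
   class t;
   var y;
run;
```

Because treatment is randomized in this sketch, the difference between the two group means of y should be close to the difference between the means of y1 and y0. When subjects select their own treatment, that agreement is no longer guaranteed.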

What enables randomized experiments to obtain unbiased estimation of causal effect?
  You can safely assume the independence of potential outcomes and treatment assignment:
    $(Y(1), Y(0)) \perp T$

As a consequence, for randomized experiments
  $E[Y(1)] = E[Y(1) \mid T = 1] = E[Y \mid T = 1]$
  $E[Y(0)] = E[Y(0) \mid T = 0] = E[Y \mid T = 0]$
  Therefore, you can use unadjusted treatment and control means to estimate causal effects

What about observational data?
  In general, you cannot assume: $(Y(1), Y(0)) \perp T$
  So, you cannot use unadjusted treatment and control means to estimate causal effects

The strong ignorability assumption ensures the identification of treatment effects
  No unmeasured confounding: $(Y(1), Y(0)) \perp T \mid X$, where $X$ denotes the pretreatment characteristics
  Positivity: $e(X) \in (0, 1)$, where $e(X) = P(T = 1 \mid X)$ is the probability of receiving treatment given $X$

Under these assumptions the propensity score becomes a natural basis for the matching problem
  $Y(t) \perp T \mid X \;\Rightarrow\; Y(t) \perp T \mid e(X), \quad t = 0, 1$
  The propensity score methods lead to unbiased estimation of causal effects (Rosenbaum and Rubin 1983)
  In practice, a correct modeling of $e(X)$ is required

ATE or ATT?
  A standard answer: ATT is for policy making
  What is your research question? What is your application?
  Are you interested in knowing the treatment effect if everybody in the population took the treatment?
  Are you interested in knowing the treatment effect only for those who take the treatment voluntarily?

Matching aims to extract comparable groups that differ only in their treatment assignment
  The ignorability assumption: independence of T and potential outcomes conditional on the background characteristics
  Balance in all covariates implies balance in propensity scores; does balance in propensity scores imply balance in covariates?

Excluding the outcome from the matching process allows you to consider multiple propensity score models
  [Flowchart: (Re-)specify a propensity score model; check "Good covariate balance?"; if No, re-specify the model; if Yes, proceed to outcome analysis]

Example 2. Optimal Variable Ratio Matching

Does quitting smoking lead to weight change?
  Data: A subset (N=1,746) of the NHANES I Epidemiologic Follow-Up Study (NHEFS) in Hernán and Robins (2016)

  Medical and behavioral information was collected in an initial physical examination
  Follow-up interviews were done approximately 10 years later
  Treatment variable Quit: quit smoking during the 10-year period
  Outcome variable Change: change in weight (in kg)

Other Variables
  Activity: Level of daily activity, with values 0, 1, and 2
  Age: Age in 1971
  BaseWeight: Weight in kilograms in 1971
  Education: Level of education, with values 0, 1, 2, 3, and 4
  Exercise: Amount of regular recreational exercise, with values 0, 1, and 2
  PerDay: Number of cigarettes smoked per day in 1971
  Race: 0 for white; 1 otherwise
  Sex: 0 for male; 1 for female
  Weight: Weight in kilograms at the follow-up interview
  YearsSmoke: Number of years an individual has smoked

Data Set

    data SmokingWeight;
       input Sex Age Race Education Exercise BaseWeight Weight
             Change Activity YearsSmoke PerDay Quit;
       datalines;
    0 42 1 1 2 79.04 68.95 -10.09 0 29 30 0
    0 36 0 2 0 58.63 61.23 2.60 0 24 20 0
    1 56 1 2 2 56.81 66.22 9.41 0 26 20 0
    0 68 1 1 2 59.42 64.41 4.99 1 53 3 0
    0 40 0 2 1 87.09 92.08 4.99 1 19 20 0
    ... more lines ...
    0 45 0 1 0 63.05 64.41 1.36 0 29 40 0
    1 47 0 1 0 57.72 61.23 3.51 0 31 20 0
    1 51 0 3 0 62.71 . . 0 30 40 0
    0 68 0 1 1 52.39 57.15 4.76 1 46 15 0
    0 26 0 . 0 86.75 87.54 0.79 0 9 20 0
    0 29 0 2 1 90.83 106.59 15.76 1 14 30 1
    ;

SAS code for optimal variable ratio matching

    proc psmatch data=smokingweight;
       class Sex Race Education Exercise Activity Quit;
       psmodel Quit(Treated='1') = Sex Age Education Exercise
                                   Activity YearsSmoke PerDay;
       /* match between 1 and 4 control units to each treated unit */
       match method=varratio(kmin=1 kmax=4) distance=lps caliper=.5;
       assess lps var=(age YearsSmoke) / plots=all;
       /* obs=match, as in Example 1, keeps only the matched observations */
       output out(obs=match)=smokeMatched matchattwgt=matchattwgt
              matchid=MatchId;
    run;

  The kmin and kmax options set the minimum and maximum number of control units matched to each treated unit
  The matchattwgt= option names the variable that holds the matching ATT weights in the output data set

The matching weights are used in assessing balance and in the outcome analysis

  Obs  age  Quit  ...     _PS_  matchattwgt  MatchId
    1   33     0  ...  0.09627         0.25        1
    2   28     0  ...  0.09586         0.25        1
    3   37     0  ...  0.09697         0.25        1
    4   31     1  ...  0.09656         1.00        1
    5   39     0  ...  0.09692         0.25        1
    6   28     0  ...  0.09580         0.25        2
    7   47     1  ...  0.09815         1.00        2
    8   27     0  ...  0.09709         0.25        2
  ...

How are matching weights computed?
  For a matched set of units, let:
    $n$ be the total number of units in the set
    $n_t$ be the number of treated units in the set
    $n_c$ be the total number of control units in the set
  The ATT matching weights (matchattwgt) for the set sum to $2 n_t$ and are:
    $1$: if the unit is in the treatment condition
    $n_t / n_c$: if the unit is in the control condition
  The ATE matching weights for the set sum to [expression missing in the source] and are:
    [expression missing in the source]: if the unit is in the treatment condition
    [expression missing in the source]: if the unit is in the control condition
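
As a worked check, take the first matched set in the listing of the matched data (MatchId = 1), which contains one treated unit and four control units. The sketch assumes the usual ATT convention of weight 1 for each treated unit and $n_t / n_c$ for each control unit; the symbols $w_{\text{treated}}$ and $w_{\text{control}}$ are introduced here only for illustration.

```latex
\begin{aligned}
n_t &= 1, \qquad n_c = 4, \qquad n = n_t + n_c = 5 \\
w_{\text{treated}} &= 1, \qquad w_{\text{control}} = \frac{n_t}{n_c} = \frac{1}{4} = 0.25 \\
\sum_{\text{set}} w &= n_t \cdot 1 + n_c \cdot \frac{n_t}{n_c} = 2\,n_t = 2
\end{aligned}
```

Under that assumption the control weight of 0.25 and the treated weight of 1.00 agree with the matchattwgt values shown for this set, and the controls in the set jointly carry the same total weight as the treated unit.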

Information on the input data and support region

Assessing covariate balance in PROC PSMATCH

    assess lps var=(age YearsSmoke) / plots=all;

  The ASSESS statement
    LPS: requests the balance assessment for the logit of propensity scores
    VAR=(age YearsSmoke): lists the two variables for balance assessment
    PLOTS=ALL: displays all the plots for assessing covariate balance

Distribution of the Logit of Propensity Scores

Standardized mean differences (Treated - Control)
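
For a single variable, a standardized mean difference can be written as below. This is a sketch of one common definition, assuming the denominator pools the treated and control standard deviations with equal weight. The symbols $\bar{x}_t$, $\bar{x}_c$, $s_t$, $s_c$, and $s_p$ are introduced here for illustration, and PROC PSMATCH may scale the difference differently depending on its options.

```latex
d = \frac{\bar{x}_t - \bar{x}_c}{s_p},
\qquad
s_p = \sqrt{\frac{s_t^2 + s_c^2}{2}}
```

Here $\bar{x}_t$ and $\bar{x}_c$ are the treated and control means of the variable, computed with the matching weights when a matched sample is assessed. Values of $d$ near 0 indicate that the two groups are balanced on that variable.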

Visualize the standardized mean differences

Distributions are plotted for continuous variables

Use PROC FREQ to examine categorical variables

    proc freq data=smokeMatched;
       tables quit*Exercise quit*Activity;
       weight matchattwgt;
    run;

Distributions of Exercise and Activity for the treatment conditions

Estimation of the ATT

    proc ttest data=smokeMatched;
       class quit;
       var change;
       weight matchattwgt;
    run;

  Variable: Change
  Quit        Method            Mean      95% CL Mean
  0                           1.2330   0.7850   1.6809
  1                           4.5249   3.6682   5.3816
  Diff (1-2)  Pooled         -3.2920  -4.1308  -2.4532
  Diff (1-2)  Satterthwaite  -3.2920  -4.2580  -2.3259

Important Options for the Matching Methods of PROC PSMATCH

  Option              Feature
  METHOD=             Optimal fixed ratio (OPTIMAL)
                      Optimal variable ratio (VARRATIO)
                      Optimal full matching (FULL)
                      Greedy nearest neighbor matching (GREEDY)
                      Replacement matching (REPLACE)
  CALIPER=            Specifies the criterion for matching
  EXACT=              Specifies class variables that require exact matching
  STAT= or DISTANCE=  Specifies the distance measure for matching (PS, LPS, or MAH)
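
To show how the METHOD= option changes an analysis, the sketch below reruns Example 1 with greedy nearest neighbor matching in place of optimal matching. It is an illustration rather than part of the original examples: the output data set name OutGreedy is hypothetical, and the remaining options are copied from the Example 1 code on the assumption that they combine with METHOD=GREEDY in the same way.

```sas
/* Hypothetical variant of Example 1: greedy 1-1 matching instead of optimal */
proc psmatch data=School region=cs;
   class Music Sports;
   psmodel Music(Treated='Yes') = Sports Absence;
   /* each treated unit is matched in turn to its nearest available control */
   match method=greedy(k=1) exact=Sports stat=lps caliper=0.25;
   assess ps var=(Sports Absence) / plots=all weight=none;
   /* OutGreedy is an illustrative data set name */
   output out(obs=match)=OutGreedy matchid=_MatchID;
run;
```

Greedy matching fixes each pair as it goes, so the resulting matched sample can differ from the optimal one, which minimizes the total distance over all pairs. Comparing the balance assessments from the two runs is one way to choose between them before looking at the outcome.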
