Graph-based Causal Inference for Health Decision Making

Jundong Li, Assistant Professor
Department of Electrical and Computer Engineering, Department of Computer Science, and School of Data Science, University of Virginia
https://jundongli.github.io/
jundong@virginia.edu

Graphs are Pervasive in Biology and Medicine
[Figure. Source: Deep Learning for Network Biology, snap.stanford.edu/deepnetbio-ismb, ISMB 2018]

Graph Machine Learning
- Graph ML has been widely used in different tasks (e.g., node classification, link prediction, graph classification).
- It has a wide range of applications in the health and biomedical domain:
  - classify the function of proteins in the interactome
  - predict which disease a new molecule might treat
  - understand the properties of a particular molecular structure
[Figures. Source: Deep Learning for Network Biology, snap.stanford.edu/deepnetbio-ismb, ISMB 2018]

Correlation vs. Causation
- Most popular Graph ML algorithms find relationships (correlations) in graph data and use those correlations for prediction.
- However, correlation does not imply causation.
- For two correlated events A and B, the possible relations include: (1) A causes B; (2) B causes A; (3) A and B are consequences of a common cause but do not cause each other; etc.
- Example: does alcohol consumption cause lung cancer?
[Figure. Source: https://sitn.hms.harvard.edu/flash/2021/when-correlation-does-not-imply-causation-why-your-gut-microbes-may-not-yet-be-a-silver-bullet-to-all-your-problems/]

Causal Inference on Graphs
- Causal inference studies causal relations rather than statistical dependencies between variables.
- Causal effect estimation: assessing the causal effect of a treatment (e.g., mask wearing) on an outcome (e.g., disease infection) for one unit or a group of units.
- Causal effect estimation on graphs. Question: given a contact network, how does the use of face masks influence COVID-19 infection risk?
- Treatment T: 0 (control) or 1 (treated).
- Potential outcomes Y (disease infection): only one of them can be observed!
[Diagram: wear mask (T) causes infection (Y)]

Experimental Study vs. Observational Study
- Experimental study
  - Randomized controlled trial (RCT)
  - Assignment of control/treated is random
  - Expensive and time-consuming, e.g., A/B testing
- Observational study
  - Assignment is NOT random
  - Approaches: structural causal models, potential outcome framework
We focus on observational data because it can be easily collected in many high-impact domains and RCTs are not practical in many scenarios, especially when the observational data is connected as a graph.

Potential Outcome Framework
- Instance: an independent unit of the population (e.g., a patient).
- Treatment: an action (e.g., medication) that applies to an instance. When the treatment is binary, $T_i = 1$ indicates that instance $i$ is in the treatment group and $T_i = 0$ indicates that it is in the control group.
- Potential outcome: the outcome that would have been observed if the instance had received treatment $T$, i.e., $Y_i(T)$, $T \in \{0, 1\}$.
- Observed outcome: the potential outcome of the treatment that is actually applied, $Y_i(T_i)$ ($T_i$ is either 0 or 1).
- Counterfactual outcome: the potential outcome of the treatment that the instance did not take, $Y_i(1 - T_i)$.
An instance can only take one treatment. Thus, counterfactual outcomes are not observed, which leads to the well-known "missing data" problem.

Potential Outcome Framework: Example
[Figure: instance "Alice"]
- $x_i$: feature vector of instance $i$ (e.g., EHR record)
- $T_i = 1$: receive the treatment (take medicine)
- $Y_i(1)$: bad health outcome
- "What-if" questions: reason about a world that does not exist. Observed world: $Y_i(1)$. Counterfactual world: $Y_i(0)$.
- Individual Treatment Effect (ITE): $\tau_i = Y_i(1) - Y_i(0)$
- Average Treatment Effect (ATE): $\mathrm{ATE} = \mathbb{E}(\tau_i) = \mathbb{E}(Y_i(1) - Y_i(0))$

Challenge #1: Hidden Confounders
- Confounders influence both the treatment and the outcome. If not handled well, confounders can bias causal effect estimation (correlation ≠ causation).
- However, in many practical scenarios:
  - confounders are often unobserved
  - confounders may be time-varying
- It is hard to control for these confounders.
- Existing methods are mainly based on the unconfoundedness assumption.

Challenge #2: Interference between Units
- Interference: the treatment of an individual may causally affect the outcome of other individuals.
- Example: whether a person wears a face mask in public may influence the infection risk of other people.
- In fact, interference is ubiquitous in graph data.
- However, existing methods are mainly based on the no-interference assumption.

Graph-based Causal Inference
Problem definition:
- Given: observational data {X, A, T, Y}
  - node features X
  - graph structure A
  - treatment T
  - observed outcomes Y
- Goal: given a graph, a treatment assignment, and the observed outcomes, estimate the individual treatment effect (ITE) for each node i.
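To make the problem definition concrete, here is a minimal simulation sketch of observational graph data {X, A, T, Y}. Everything in it is an assumption made for illustration and is not taken from the talk: the data-generating process, the coefficients, and the helper names `simulate_graph_data` and `naive_ate` are hypothetical. A hidden confounder (e.g., vigilance) raises the chance of treatment and lowers infection risk, and each node's outcome also depends on the share of treated neighbors (interference). Because the simulation keeps both potential outcomes, the true ITE can be compared with a naive difference in group means.

```python
import numpy as np


def simulate_graph_data(n=1000, seed=0):
    """Simulate {X, A, T, Y} with a hidden confounder Z (all coefficients are made up)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                    # hidden confounder, e.g. vigilance
    x = z + rng.normal(scale=0.5, size=n)     # observed node feature, a noisy proxy of Z
    upper = np.triu(rng.random((n, n)) < 5.0 / n, k=1)
    a = (upper | upper.T).astype(float)       # symmetric adjacency matrix A, no self-loops
    p_treat = 1.0 / (1.0 + np.exp(-2.0 * z))  # Z raises the probability of treatment
    t = (rng.random(n) < p_treat).astype(float)
    # Interference: share of treated neighbors of each node.
    exposure = a @ t / np.maximum(a.sum(axis=1), 1.0)
    noise = rng.normal(scale=0.1, size=n)
    base = -1.5 * z - 0.5 * exposure + noise  # Z and treated neighbors lower infection risk
    y1, y0 = base - 1.0, base                 # potential outcomes Y_i(1) and Y_i(0)
    y = np.where(t == 1.0, y1, y0)            # only one potential outcome is observed
    return {"x": x, "a": a, "t": t, "y": y, "y1": y1, "y0": y0}


def naive_ate(t, y):
    """Difference in mean observed outcome between treated and control units."""
    return y[t == 1.0].mean() - y[t == 0.0].mean()


data = simulate_graph_data()
true_ite = data["y1"] - data["y0"]            # known only because this is a simulation
print("true ATE:      ", true_ite.mean())
print("naive estimate:", naive_ate(data["t"], data["y"]))
```

In this toy setup the naive estimate overstates the benefit of treatment, because treated units are also the more vigilant ones. The feature `x` and the graph `a` are the only handles an estimator has on the hidden confounder, which is the situation the rest of the talk addresses.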
[Diagram: wear mask (treatment) causes infection (outcome), annotated with the ITE of each node]
- Q1: How to estimate the causal effect under hidden confounders? (C1) [WWW 2022]
- Q2: How to estimate the causal effect under interference? (C2) [KDD 2022, Best Paper Award]

Causal Inference under Hidden Confounders
- Hidden confounders Z causally affect treatment C and outcome Y.
  [Diagram: vigilance affects both wearing a face mask and infection]
- Graph data (node features X and network structure A) can be used as proxy variables for hidden confounders Z. Similar nodes are connected more often than dissimilar nodes (homophily).
  [Diagram: vigilance is reflected in web searches and physical contact]
- Dynamic environment: historical data (previous confounders $Z^t$, treatment $T^t$, outcome $Y^t$) can influence the current confounders $Z^{t+1}$.
  [Diagram: time steps t and t+1]

Key Idea of Tackling Hidden Confounders
- Motivation: hidden confounders often lead to biased causal effect estimation.
- Key idea: capture hidden confounders and their evolution patterns through representation learning from dynamic graph data.
  - Graph data can be proxy variables for hidden confounders.
  - Effective deep learning techniques are utilized.
  - Estimate ITE based on confounder representations.
[Figure: confounder representations over time]

Formulation
- $\tau_i^t = \mathbb{E}[Y_i^t(1) \mid x_i^t, A^t, H^t] - \mathbb{E}[Y_i^t(0) \mid x_i^t, A^t, H^t]$
  - $\tau_i^t$: ITE of instance $i$ at time stamp $t$
  - $Y_i^t(1)$: potential outcome of instance $i$ if treatment = 1 at time stamp $t$
  - $x_i^t$: features of instance $i$ at time stamp $t$
  - $A^t$: graph structure at time stamp $t$
  - $H^t$: historical data
- Problem statement
  - Given: (1) observational data $\{X^t, T^t, Y^t\}$ across time stamps, and (2) graph adjacency matrices $\{A^t\}$ across time stamps.
  - Estimate: the individual treatment effect $\tau_i^t$ for each instance at each time stamp.
- Theorem: if we recover $p(z^t \mid x^t, A^t, H^t)$ and $p(y^t \mid z^t, T^t)$, then the proposed method can recover the ITE for each node.

Proposed Framework: Overview
[Figure: dynamic graphs are mapped to confounder representations]

Proposed Framework: Details
- Confounder representation learning: a graph neural network computes the confounder representation $z_i^t$ of each instance from its features, the graph structure, and a history embedding.
- Prediction for potential outcomes and treatment: prediction functions map $z_i^t$ to the two potential outcomes and to the treatment.
- Representation balancing: helps reduce the biases in ITE estimation by balancing the distributions of confounder representations of the treated (T = 1) and control (T = 0) groups.
- Overall loss: combines the outcome prediction loss, the treatment prediction loss, and the balancing loss.

Evaluation: A Case Study on COVID-19 Policies
- The outbreak of COVID-19 has been affecting public health since 2019.
- Various non-pharmaceutical public policies have been announced to limit the impact of COVID-19 across the US (e.g., social distancing, mask requirement, travel restrictions).
- To help future policy makers, a natural question is: which policy is more effective in controlling the impact of COVID-19?
- Specifically, we study the causal effect of different public policies (treatment) on the outbreak dynamics (outcome).
[Figure: total cases per million people, in thousands]

Data Collection: An Overview
To study causal effects, we take each county in the U.S. as a unit, and collect:
- Treatment: whether a certain policy is in effect (1 or 0) in different counties.
- Outcome: the number of confirmed cases and death cases in different counties.
To control for unobserved confounders, we collect:
- Features (covariates): data that reflect confounders (e.g., residents' vigilance) in counties.
- Graphs: relational information among counties, e.g., distance network / mobility flow.
We assume these features and networks are correlated with the unobserved confounders.
[Diagrams: causal graph over confounder, treatment, and outcome; the same graph with features and graph added]

Causal Assessment of Public Policies
Observations:
- The policies about social distancing and masks have a positive causal impact on reducing the number of confirmed cases.
- The policies about reopening have a negative causal impact on reducing the number of confirmed cases.
[Figure: estimated causal effect over time for (a) social distance, (b) reopening, (c) mask. Legend: better for reducing the spread of COVID-19; no causal effect; more harmful for reducing the spread of COVID-19]

Confounder Control
- Our method achieves high performance in both outcome prediction and treatment prediction, which implicitly indicates that it can capture the confounders.
[Table: prediction of the confirmed/death cases and policy assignment by different methods (existing methods, variants of our method, our method). Column annotations: "the lower the better" and "the higher the better"]

High-order Interference
- Interference (spillover effect): the treatment of an individual may causally affect the outcome of other individuals.
- Both pairwise and high-order interference exist in real-world scenarios.
  - Whether a person wears a face mask in public may influence the infection risk of other people: pairwise interference.
  - The face-covering practice of other individuals may affect the target individual through physical contact at a gathering event: high-order interference.
- In hypergraphs, each hyperedge can connect an arbitrary number of nodes, in contrast to an ordinary edge, which connects exactly two nodes.
- The interaction between nodes 2 and 3 may also influence the exposure of node 1 to the virus, i.e., (2, 3) → 1.

Causal Inference Under High-order Interference
- Motivation: high-order interference exists in hypergraphs.
- Given: observational data $\{X, \mathcal{H}, T, Y\}$, denoting features, hypergraph, treatment, and observed outcomes.
- Goal: estimate the ITE for each individual (node) $i$:
  $\tau_i = \mathbb{E}[Y_i(1) - Y_i(0) \mid x_i, T_{-i}, X_{-i}, \mathcal{H}]$
  - $T_{-i}$: treatment of other nodes
  - $Y_i(1)$, $Y_i(0)$: potential outcome when $T_i = 1$ or $T_i = 0$

Assumptions
- Assumption 1. For any node $i$, given the node features X, the potential outcomes are independent of the treatment assignment and the summary of neighbors.
- Assumption 2. For any node $i$ and any values of $\mathcal{H}$, $T_{-i}$, and $X_{-i}$, if the output of a summary function $o_i = \mathrm{SMR}(\mathcal{H}, T_{-i}, X_{-i})$ is determined, then the values of the potential outcomes with feature X are also determined. The summary function characterizes all the "environmental" information related to node $i$.
- Theory (identifiability): the defined ITE is identifiable from observational data under these assumptions.

Proposed Framework
[Figure: input hypergraph]
- Confounder representation learning: encode the node features into a latent space to capture the confounders. (Assumption 1: a weaker version of the unconfoundedness assumption.)
- Interference modeling: capture the high-order interference for each individual through representation learning.
  - Propagate the neighboring treatment assignments and confounder representations with a hypergraph module. (Assumption 2: expressiveness of the summary function.)
  - A hypergraph convolutional network is applied in the hypergraph module. It computes the interference representation in the $(l+1)$-th layer using the vanilla Laplacian matrix for the hypergraph.
  - Modeling interference with different significance: e.g., active individuals may be more likely to influence or be influenced by others. An attention mechanism is used in the hypergraph module.
- Outcome prediction: predict the potential outcomes based on the learned representations. The loss combines the outcome prediction loss, the balancing loss, and model parameter regularization.

Evaluation
- Use real-world node features and hypergraphs.
- Simulate potential outcomes under a linear setting and a quadratic setting. The simulation involves the hyperedges that contain node $i$, the nodes in each hyperedge, and the strength of interference.
- Key question: how does one's face-covering practice (treatment) causally affect their infection risk of an infectious disease (outcome)? In each group contact, one may bring the virus to the surrounding environment, and thus affect other people's infection risk.
- Observation: the proposed framework outperforms all the baselines under both linear and quadratic outcome simulation settings.
[Table: results by outcome simulation setting for our method, no-graph baselines, and projected-graph baselines. The smaller, the better]

Evaluation
Main observations:
- Interference. Our framework outperforms all the baselines, especially under stronger interference.
- Hypergraph. The high-order relational knowledge can help causal effect estimation.
[Figure: comparison with the variants "Projected graph, HGNN", "Projected graph, GNN", and "No balancing" as interference becomes stronger]

Conclusion
- Background: learning causality plays a central role in many health-related decision-making scenarios.
- Problem: we study novel problems of causal effect learning with graph information.
- Methodologies: we propose graph-based solutions to capture (1) the influence of hidden/evolving confounders; and (2) high-order interference among units on a graph.
- Results: experiments and simulations on health-related datasets validate our hypothesis that graph information helps relax stringent causal assumptions when estimating causal effects for decision making.

Empower Graph Machine Learning for Trustworthy Decision Making
Q&A
Contact: Jundong Li
https://jundongli.github.io/
jundong@virginia.edu
Acknowledgements
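
Supplementary sketch for the interference-modeling slides: the convolution formula itself did not survive in the slide text, so the code below is only an illustration under stated assumptions. It uses the commonly used normalized hypergraph propagation matrix $D_v^{-1/2} H D_e^{-1} H^\top D_v^{-1/2}$ (with unit hyperedge weights) as a stand-in for the "vanilla Laplacian matrix for the hypergraph", and a ReLU layer as a stand-in for the activation. The function names, the toy hypergraph, and the input representations are hypothetical.

```python
import numpy as np


def hypergraph_laplacian(incidence):
    """Normalized propagation matrix Dv^-1/2 H De^-1 H^T Dv^-1/2 (unit hyperedge weights).

    incidence[i, e] = 1 if node i belongs to hyperedge e, else 0.
    """
    h = np.asarray(incidence, dtype=float)
    node_deg = h.sum(axis=1)                  # number of hyperedges per node
    edge_deg = h.sum(axis=0)                  # number of nodes per hyperedge
    node_scale = np.zeros_like(node_deg)
    node_scale[node_deg > 0] = node_deg[node_deg > 0] ** -0.5
    edge_scale = np.zeros_like(edge_deg)
    edge_scale[edge_deg > 0] = 1.0 / edge_deg[edge_deg > 0]
    left = node_scale[:, None] * h * edge_scale[None, :]   # Dv^-1/2 H De^-1
    right = h.T * node_scale[None, :]                      # H^T Dv^-1/2
    return left @ right


def hypergraph_conv_layer(lap, reps, weight):
    """One propagation step: ReLU(L P W), an assumed form of the layer update."""
    return np.maximum(lap @ reps @ weight, 0.0)


# Toy hypergraph (hypothetical): 4 nodes, hyperedges {0, 1, 2} and {2, 3}.
incidence = np.array([[1, 0], [1, 0], [1, 1], [0, 1]])
lap = hypergraph_laplacian(incidence)

rng = np.random.default_rng(0)
reps = rng.normal(size=(4, 3))     # stand-in for per-node input representations
weight = rng.normal(size=(3, 2))   # stand-in for learnable layer parameters
out = hypergraph_conv_layer(lap, reps, weight)
print(out.shape)
```

Nodes 0 and 3 share no hyperedge, so the corresponding entry of the propagation matrix is zero and information passes between them only through node 2 over multiple layers. Each hyperedge mixes all of its members at once, which is what distinguishes this from message passing on pairwise edges.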