ZIN: When and How to Learn Invariance without Domain Partition
Yong Lin
October 18, 2023

Contents
1. Learning Invariance without Environment Indexes
2. References

Introduction
- The common i.i.d. (independent and identically distributed) assumption does not always hold.
- In many real-world applications, the test distribution may differ from the training distribution.
- This is known as the out-of-distribution (OOD) generalization problem.

A Motivating Example
[Figure]

Invariance in Causality
Lemma (Invariance Property). The conditional distribution of $Y$ given its direct causes does not change when we intervene on any node other than $Y$.
Figure: Image taken from Peters et al., 2016. The conditional expectation $E[Y \mid X_2, X_4]$ remains invariant under every possible intervention on nodes other than $Y$.
Invariant Causal Prediction (ICP) (Peters et al., 2016) first proposed using the invariance property to identify the parents of $Y$.

Invariant Risk Minimization (IRM)
Definitions:
- Invariant features $X_{inv}$: direct causes of $Y$, i.e., $X_2$ and $X_4$ in the previous example.
- Spurious features $X_s$: features other than the direct causes.
IRM seeks to learn a representation $\Phi(X)$ that relies exclusively on the invariant features.

Motivation
- IRM requires sufficiently many environments to learn invariance. However, given a collected dataset (which may contain a mixture of environments), how can we divide it into environments?
- Some works try to infer environments automatically (Creager et al., 2021; Liu et al., 2021a; Liu et al., 2021b). However, we provide a counterexample showing that this is generally impossible without more information (Lin et al., 2022).

The Impossibility Result
Suppose $(X_1, X_2, Y) \in \{0,1\}^3$, and we observe a mixture of distributions from two environments as follows:

$$
Y = \begin{cases} 0, & \text{w.p. } 0.5,\\ 1, & \text{w.p. } 0.5, \end{cases}
\qquad
\begin{cases}
X_1 = Y,\ X_2 = Y, & \text{w.p. } 0.4,\\
X_1 = Y,\ X_2 = 1 - Y, & \text{w.p. } 0.1,\\
X_1 = 1 - Y,\ X_2 = Y, & \text{w.p. } 0.4,\\
X_1 = 1 - Y,\ X_2 = 1 - Y, & \text{w.p. } 0.1.
\end{cases}
\tag{1}
$$

This distribution can be generated by either of the following processes:

(a) $X_1$ is the invariant feature while $X_2$ is spurious, where $p_s^{e=1} = 0.7$ and $p_s^{e=2} = 0.9$:

$$
X_1 \sim B(0.5), \qquad
Y = \begin{cases} X_1, & \text{w.p. } 0.5,\\ 1 - X_1, & \text{w.p. } 0.5, \end{cases} \qquad
X_2^e = \begin{cases} Y, & \text{w.p. } p_s^e,\\ 1 - Y, & \text{w.p. } 1 - p_s^e. \end{cases}
$$

(b) $X_2$ is the invariant feature while $X_1$ is spurious, where $p_s^{e=1} = 0.25$ and $p_s^{e=2} = 0.75$:

$$
X_2 \sim B(0.5), \qquad
Y = \begin{cases} X_2, & \text{w.p. } 0.8,\\ 1 - X_2, & \text{w.p. } 0.2, \end{cases} \qquad
X_1^e = \begin{cases} Y, & \text{w.p. } p_s^e,\\ 1 - Y, & \text{w.p. } 1 - p_s^e. \end{cases}
$$

ZIN: auxiliary information Z for environmental INference
Now there are two lines of research:
- IRM and its variants need explicit environment indexes, which are not available in many applications.
- General environment inference is impossible without more information.
We propose the framework ZIN:
- ZIN can identify invariant features based on additional auxiliary variables that encode some information about the latent heterogeneity (Lin et al., 2022).

ZIN
We assume that additional auxiliary information $Z \in \mathbb{R}^{d_z}$ accompanies the data $(X, Y)$. ZIN solves the following problem:

$$
\min_{w, \Phi} \; \max_{\rho,\, w_1, \dots, w_K} \; L(\Phi, w, w_1, \dots, w_K, \rho)
= R(w, \Phi) + \lambda \sum_{j=1}^{K} \underbrace{\left[ R_{\rho^{(j)}}(w, \Phi) - R_{\rho^{(j)}}(w_j, \Phi) \right]}_{\text{invariance penalty}}
\tag{2}
$$

where $K \in \mathbb{Z}^{+}$ is a pre-specified hyper-parameter, $\rho(\cdot): \mathbb{R}^{d_z} \to \mathbb{R}^{K}$ with $\rho^{(j)}$ denoting its $j$-th entry,

$$
R(w, \Phi) = E\left[\ell(f_w(\Phi(x)), y)\right], \qquad
R_{\rho^{(j)}}(w, \Phi) = E\left[\rho^{(j)}(z)\, \ell(f_w(\Phi(x)), y)\right].
$$

The requirement for Z
We still consider a feature selection problem with $X = [X_{inv}, X_s]$ and $\Phi \in \{0,1\}^{d_{inv} + d_s}$.

Condition 1 (Invariance Preserving Condition). For the invariant feature $X_{inv}$ and any function $\rho(\cdot)$, it holds that
$$H(Y \mid X_{inv}, \rho(Z)) = H(Y \mid X_{inv}).$$

Condition 2 (Non-invariance Distinguishing Condition). For any feature $X_s^k \in X_s$, there exist a function $\rho(\cdot)$ and a constant $C > 0$ such that
$$H(Y \mid X_s^k) - H(Y \mid X_s^k, \rho(Z)) \geq C.$$

We remark that
- Condition 1 is satisfied if $H(Y \mid X_{inv}, Z) = H(Y \mid X_{inv})$, that is, $Z \perp Y \mid X_{inv}$.
- Condition 2 requires $P(Y \mid X_s^k) \neq P(Y \mid X_s^k, \rho(Z))$, and hence $Z \not\perp Y \mid X_s$.

Identifiability of ZIN
Theorem (Identifiability). Assume that, given a feature mask $\Phi$ and any constant $\epsilon > 0$, there exists $f \in \mathcal{F}$ such that $E[\ell(f(\Phi(X)), Y)] \leq H(Y \mid \Phi(X)) + \epsilon$. If Conditions 1-2 hold, $\epsilon$ is sufficiently small, and the penalty weight $\lambda$ lies in a suitable range (both determined by $C$ and $H(Y)$; see Lin et al., 2022 for the explicit bounds), then we have
$$L(\Phi_{inv}) < L(\Phi), \quad \forall\, \Phi \neq \Phi_{inv}.$$
Thus, the solution to Problem (2) identifies the invariant features.

We further show that Conditions 1-2 are almost necessary: any violation leads to the failure of identifiability (Lin et al., 2022).

Example of Z
Figure: Illustration of $Z$ satisfying $Y \perp Z \mid X_{inv}$, where $X_{inv} = [X_1, X_2]$.
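
The impossibility result can be checked by exact enumeration. The sketch below computes the joint distribution of $(X_1, X_2, Y)$ under processes (a) and (b) and compares them. It assumes the two environments are mixed with equal weight, which the slides do not state explicitly; the function name `mixture_joint` and its parameters are illustrative, not part of the original work.

```python
from fractions import Fraction as F
from itertools import product


def mixture_joint(invariant, p_inv, p_spurious_by_env):
    """Exact joint P(X1, X2, Y) for a mixture of environments.

    invariant: 1 or 2, the index of the feature drawn first as Bernoulli(0.5).
    p_inv: P(Y == invariant feature), identical in every environment.
    p_spurious_by_env: P(spurious feature == Y), one value per environment.
    """
    joint = {}
    # Assumption: every environment contributes the same share of the data.
    weight = F(1, len(p_spurious_by_env))
    for p_s in p_spurious_by_env:
        for x_inv, y, x_sp in product((0, 1), repeat=3):
            prob = F(1, 2)  # invariant feature ~ Bernoulli(0.5)
            prob *= p_inv if y == x_inv else 1 - p_inv
            prob *= p_s if x_sp == y else 1 - p_s
            key = (x_inv, x_sp, y) if invariant == 1 else (x_sp, x_inv, y)
            joint[key] = joint.get(key, 0) + weight * prob
    return joint


# Process (a): X1 invariant, X2 spurious with p_s = 0.7 and 0.9.
process_a = mixture_joint(1, F(1, 2), [F(7, 10), F(9, 10)])
# Process (b): X2 invariant, X1 spurious with p_s = 0.25 and 0.75.
process_b = mixture_joint(2, F(4, 5), [F(1, 4), F(3, 4)])

same_distribution = process_a == process_b
both_agree_with_y = process_a[(0, 0, 0)] + process_a[(1, 1, 1)]
print(same_distribution, both_agree_with_y)
```

Exact fractions are used so that equality of the two joints is not affected by floating-point rounding. Under the equal-weight assumption the two processes give the same observed mixture, so the data alone cannot tell which feature is invariant.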

Experiments: Synthetic Dataset
We simulate temporal heterogeneity with distributional shift w.r.t. time. Let $t \in [0, 1]$ be the time index and $X_{inv}(t) \in \mathbb{R}$ the invariant feature. The data generation process is

$$
X_{inv}(t) \sim \begin{cases} N(1, \sigma^2), & \text{w.p. } 0.5,\\ N(-1, \sigma^2), & \text{w.p. } 0.5, \end{cases}
\qquad
Y(t) \sim \begin{cases} \mathrm{sign}(X_{inv}(t)), & \text{w.p. } p_v,\\ -\mathrm{sign}(X_{inv}(t)), & \text{w.p. } \bar{p}_v, \end{cases}
\qquad
X_s(t) \sim \begin{cases} N(Y(t), \sigma^2), & \text{w.p. } p_s(t),\\ N(-Y(t), \sigma^2), & \text{w.p. } \bar{p}_s(t), \end{cases}
$$

where $p_v$ is constant w.r.t. $t$, indicating a stable correlation between $Y(t)$ and $X_{inv}(t)$, $\bar{p}_v = 1 - p_v$, and $\bar{p}_s(t) = 1 - p_s(t)$.
Figure: Illustration
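
The generation process above can be sketched as a small sampler. Two choices here are assumptions rather than details from the slides: $p_s(t)$ is interpolated linearly between a start and an end value, and $\sigma$ defaults to 1. The function names `sample_point` and `sample_dataset` are hypothetical.

```python
import random


def sample_point(t, p_v, p_s_start, p_s_end, sigma=1.0, rng=random):
    """Draw (x_inv, x_s, y) at time t in [0, 1]."""
    # Assumption: p_s(t) moves linearly from p_s_start to p_s_end.
    p_s = p_s_start + (p_s_end - p_s_start) * t
    # Invariant feature: equal mixture of N(1, sigma^2) and N(-1, sigma^2).
    mean = 1.0 if rng.random() < 0.5 else -1.0
    x_inv = rng.gauss(mean, sigma)
    # Label follows sign(x_inv) with the time-independent probability p_v.
    sign = 1.0 if x_inv >= 0 else -1.0
    y = sign if rng.random() < p_v else -sign
    # Spurious feature is centred on y with the time-varying probability p_s(t).
    x_s = rng.gauss(y if rng.random() < p_s else -y, sigma)
    return x_inv, x_s, y


def sample_dataset(n, p_v=0.9, p_s_start=0.999, p_s_end=0.7, sigma=1.0, seed=0):
    """Return n rows (t, x_inv, x_s, y) with t drawn uniformly from [0, 1)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t = rng.random()
        rows.append((t, *sample_point(t, p_v, p_s_start, p_s_end, sigma, rng)))
    return rows


data = sample_dataset(5000)
agreement = sum((x_inv >= 0) == (y > 0) for _, x_inv, _, y in data) / len(data)
print(f"fraction of labels matching sign(x_inv): {agreement:.3f}")
```

In this sketch the time index `t` plays the role of the auxiliary variable $Z$: it changes how $X_s$ relates to $Y$ but not how $X_{inv}$ relates to $Y$.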

Experiments: Synthetic Dataset
Figure: Illustration of the synthetic dataset

Experiments: Synthetic Dataset

| Env Partition | $p_s(t)$ | 0.999, 0.7 | 0.999, 0.7 | 0.999, 0.7 | 0.999, 0.7 | 0.999, 0.8 | 0.999, 0.8 | 0.999, 0.8 | 0.999, 0.8 | 0.999, 0.9 | 0.999, 0.9 | 0.999, 0.9 | 0.999, 0.9 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | $p_v$ | 0.9 | 0.9 | 0.8 | 0.8 | 0.9 | 0.9 | 0.8 | 0.8 | 0.9 | 0.9 | 0.8 | 0.8 |
| | Test Acc | Mean | Worst | Mean | Worst | Mean | Worst | Mean | Worst | Mean | Worst | Mean | Worst |
| No | ERM | 75.37 | 57.31 | 59.65 | 25.81 | 68.72 | 41.97 | 55.90 | 15.07 | 60.61 | 23.39 | 52.85 | 7.57 |
| No | EIIL | 38.41 | 16.80 | 64.89 | 49.15 | 50.77 | 46.67 | 68.36 | 56.35 | 61.99 | 53.81 | 70.10 | 59.36 |
| No | HRM | 50.00 | 49.99 | 49.98 | 49.93 | 50.00 | 49.98 | 50.01 | 49.99 | 50.00 | 49.98 | 49.99 | 49.97 |
| No | ZIN | 87.50 | 85.36 | 77.85 | 75.39 | 86.35 | 82.91 | 76.79 | 72.77 | 83.71 | 75.89 | 73.55 | 64.69 |
| Yes | IRM | 87.57 | 85.47 | 77.99 | 75.65 | 86.57 | 83.25 | 77.00 | 73.39 | 83.99 | 76.48 | 73.84 | 65.33 |

Table: Test Mean and Worst accuracy (%) on the temporal heterogeneity synthetic datasets.

Experiments: Synthetic Dataset

| Env Partition | $p_s(t)$ | 0.999, 0.999, 0.7, 0.7 | 0.999, 0.999, 0.7, 0.7 | 0.999, 0.999, 0.7, 0.7 | 0.999, 0.999, 0.7, 0.7 | 0.999, 0.9, 0.8, 0.7 | 0.999, 0.9, 0.8, 0.7 | 0.999, 0.9, 0.8, 0.7 | 0.999, 0.9, 0.8, 0.7 | 0.999, 0.999, 0.8, 0.8 | 0.999, 0.999, 0.8, 0.8 | 0.999, 0.999, 0.8, 0.8 | 0.999, 0.999, 0.8, 0.8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | $p_v$ | 0.9 | 0.9 | 0.8 | 0.8 | 0.9 | 0.9 | 0.8 | 0.8 | 0.9 | 0.9 | 0.8 | 0.8 |
| | Test Acc | Mean | Worst | Mean | Worst | Mean | Worst | Mean | Worst | Mean | Worst | Mean | Worst |
| No | ERM | 76.65 | 59.48 | 60.33 | 27.25 | 76.59 | 59.35 | 60.30 | 27.25 | 69.93 | 44.65 | 56.23 | 16.60 |
| No | EIIL | 37.81 | 16.89 | 66.46 | 50.35 | 37.03 | 14.01 | 66.60 | 50.28 | 70.18 | 45.23 | 71.00 | 58.72 |
| No | HRM | 49.98 | 49.95 | 49.97 | 49.92 | 49.99 | 49.98 | 50.00 | 49.99 | 49.97 | 49.95 | 49.99 | 49.97 |
| No | ZIN | 88.66 | 87.23 | 79.16 | 78.04 | 88.28 | 86.29 | 78.92 | 77.49 | 88.00 | 85.75 | 78.80 | 77.25 |
| Yes | IRM | 83.71 | 82.24 | 73.26 | 71.25 | 86.73 | 83.79 | 75.80 | 73.33 | 84.39 | 81.48 | 73.15 | 69.97 |

Table: Test Mean and Worst accuracy (%) on the spatial heterogeneity synthetic datasets.

Experiments: Real World Dataset

| Dataset | $X$ | $X_{inv}$ | $X_s$ | $Y$ | $Z$ | Training | Testing |
|---|---|---|---|---|---|---|---|
| Synthetic 1d | $X_1, X_2$ | $X_1$ | $X_2$ | $Y$ | $t$ | corr($X_2$, $Y$) = (0.9, 0.8) | corr($X_2$, $Y$) = 0.1 |
| Synthetic 2d | $X_1, X_2$ | $X_1$ | $X_2$ | $Y$ | $(t_1, t_2)$ | (0.9, 0.9, 0.8, 0.8) | corr($X_2$, $Y$) = 0.1 |
| House Price | House Feature | - | - | Price Ranking | Year | 1900-1950 | 1950-2000 |
| Landcover | - | - | - | Cover Type | Coordinate | Africa | Outside Africa |
| CelebA | Image | - | Gender | Smile | Young, Blond Hair, Eyeglasses, ... | | |

Table: Datasets for ZIN

Experiments: Real World Dataset

Table: House price prediction (MSE).

| Method | Train | Test Mean | Test Worst |
|---|---|---|---|
| ERM | 0.1007 | 0.3597 | 0.4968 |
| EIIL | 0.6841 | 0.9625 | 1.3909 |
| EIIL(+Z) | 0.6912 | 0.9701 | 1.4201 |
| HRM | 0.3466 | 0.3521 | 0.5248 |
| HRM(+Z) | 0.3190 | 0.3764 | 0.5726 |
| ZIN | 0.2275 | 0.3118 | 0.4285 |
| IRM | 0.1112 | 0.3328 | 0.4913 |

Table: Accuracy (%) on CelebA task.

| Method | Train | Test Mean | Test Worst |
|---|---|---|---|
| ERM | 90.97 ± 0.66 | 70.76 ± 0.26 | 47.58 ± 0.46 |
| LfF | 59.89 ± 0.72 | 52.97 ± 0.56 | 44.38 ± 2.01 |
| EIIL | 44.78 ± 5.85 | 55.26 ± 1.17 | 43.12 ± 8.98 |
| ZIN(1) | 90.62 ± 0.78 | 70.79 ± 0.61 | 47.62 ± 0.98 |
| ZIN(4) | 83.57 ± 1.40 | 75.20 ± 0.71 | 63.47 ± 1.41 |
| ZIN(7) | 83.06 ± 1.28 | 76.29 ± 0.60 | 67.27 ± 1.15 |
| IRM | 81.30 ± 1.53 | 78.44 ± 0.48 | 75.03 ± 1.29 |

Experiments: Real World Dataset

| Method | IID Test | OOD Test |
|---|---|---|
| ERM (Xie et al., 2020) | 75.92 | 58.31 |
| ERM(+climate) (Xie et al., 2020) | 76.58 | 54.78 |
| LfF | 49.90 | 48.65 |
| EIIL | 71.51 | 64.37 |
| ZIN(climate) | 72.35 | 63.45 |
| ZIN(location) | 72.16 | 66.10 |

Table: Test Accuracy (%) on Landcover task.

Takeaway: how to choose Z
- Collect as much $Z$ as possible that satisfies the invariance preserving condition.
Figure: Examples of feasible $Z$ satisfying the Invariance Preserving Condition

Takeaway: how to choose Z
Figure: Examples of infeasible $Z$ violating the Invariance Preserving Condition.

Ablation study on Z

| $Z$ | Condition 1 | Condition 2 | Test Mean | Test Worst |
|---|---|---|---|---|
| $(t_1, t_2)$ | | | 78.80 | 77.25 |
| $t_2$ | | | 56.30 | 16.84 |
| $t_1$ | | | 78.79 | 77.21 |
| $(X_1, X_2)$ | | | 59.85 | 25.99 |
| $(X_1, X_2, Y)$ | | | 71.09 | 58.85 |

Table: Ablation study on choice of $Z$.
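
When choosing $Z$, the two conditions can be probed on discrete data by comparing conditional entropies with and without $Z$. The sketch below is only an illustration under stated assumptions: it uses a plug-in (empirical frequency) entropy estimate, takes $\rho$ to be the identity on a discrete $Z$, and runs on a hand-built toy dataset. The helper names `cond_entropy` and `entropy_gap` are hypothetical.

```python
from collections import Counter
from math import log


def cond_entropy(pairs):
    """Plug-in estimate of H(Y | C) in nats from a list of (context, y)."""
    joint = Counter(pairs)
    context = Counter(c for c, _ in pairs)
    n = len(pairs)
    return -sum(k / n * log(k / context[c]) for (c, _), k in joint.items())


def entropy_gap(feature, z, y):
    """H(Y | feature) - H(Y | feature, Z), with rho taken as the identity."""
    without_z = cond_entropy(list(zip(feature, y)))
    with_z = cond_entropy(list(zip(zip(feature, z), y)))
    return without_z - with_z


# Toy data (an assumption for illustration): Y matches x_inv in 3 of 4 cases
# regardless of z, while x_s equals y when z == 0 and 1 - y when z == 1.
samples = []
for x_inv in (0, 1):
    for z in (0, 1):
        for y in (x_inv, x_inv, x_inv, 1 - x_inv):
            x_s = y if z == 0 else 1 - y
            samples.append((x_inv, x_s, z, y))
x_inv_col, x_s_col, z_col, y_col = zip(*samples)

gap_inv = entropy_gap(x_inv_col, z_col, y_col)      # Condition 1: about 0
gap_spurious = entropy_gap(x_s_col, z_col, y_col)   # Condition 2: clearly > 0
print(gap_inv, gap_spurious)
```

A gap near zero for the invariant feature is consistent with the Invariance Preserving Condition, and a clearly positive gap for a spurious feature is consistent with the Non-invariance Distinguishing Condition. On real data the plug-in estimate is biased for small samples, so such a check is a rough diagnostic rather than a proof.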

References
Creager, E., Jacobsen, J.-H., & Zemel, R. (2021). Environment inference for invariant learning. In International Conference on Machine Learning (pp. 2189-2200). PMLR.
Lin, Y., Zhu, S., & Cui, P. (2022). ZIN: When and how to learn invariance by environment inference? arXiv preprint arXiv:2203.05818.
Liu, J., Hu, Z., Cui, P., Li, B., & Shen, Z. (2021a). Heterogeneous risk minimization. arXiv preprint arXiv:2105.03818.
Liu, J., Hu, Z., Cui, P., Li, B., & Shen, Z. (2021b). Kernelized heterogeneous risk minimization. In NeurIPS.
Peters, J., Bühlmann, P., & Meinshausen, N. (2016). Causal inference by using invariant prediction: identification and confidence intervals. Journal of the Royal Statistical Society. Series B (Statistical Methodology) (pp. 947-1012).
Xie, C., Chen, F., Liu, Y., & Li, Z. (2020). Risk variance penalization: From distributional robustness to causality. arXiv preprint arXiv:2006.07544.