Exploring Active Learning and Class Imbalance in Graph Data Scenarios
Zhou Min, Principal Engineer, Huawei Cloud Algorithm Innovation Lab

About the speaker
B.S. from the University of Science and Technology of China (USTC); Ph.D. from the National University of Singapore.
Research interests: pattern mining and learning on graph data and sequence data.

Contents
01 Background
02 Semantic-aware active learning on graph
03 Unlabeled Nodes Labeling for imbalanced Graph
04 Conclusion

01 Background

Data available in the form of graphs are ubiquitous.
Graphs: social networks, biology networks, atom networks, financial networks, logistic networks, telecom networks.
Tasks: link prediction, community detection, node classification, etc.

Fraud Detection
Graph Neural Networks are promising tools for fraud detection.
Challenges in fraud detection: label scarcity and class imbalance.

02 Active Learning on Graphs

Label scarcity
Labels are hard or expensive to collect.
Unlabelled data vs. labelled data.

Active Learning in Machine Learning
Prioritize the data that need to be labelled so that labelling has the highest impact on training a model.
Valuable samples: the most informative examples are the ones that the classifier is least certain about.

Active Learning in Graph Machine Learning
Select the most informative nodes as the labelled training nodes based on the graph information.
Design different graph-based criteria for node selection on graphs:
AGE: uncertainty (entropy) and representativeness (density and centrality). https://arxiv.org/pdf/1705.05085.pdf
GRAIN: influence maximization and diversity. https://arxiv.org/pdf/2108.00219.pdf

Semantic-aware Graph Active Learning
Mitigating semantic confusion from hostile neighborhood.
Semantic-aware influence correction: node influence, semantic-aware influence.

Semantic-aware Gra