Boosting Data Analytics Through High-Fidelity Synthetic Data.pdf


Boosting Data Analytics Through High-Fidelity Synthetic Data

Xiaotong Shen
xshen@umn.edu
School of Statistics, University of Minnesota
The 5th National Big Data Health Science Conference, Columbia
Joint with Yifei Liu and Rex Shen (Shen et al., 2023)

Generative AI and Synthetic Data

Synthetic data generation, propelled by generative AI, promotes a paradigm shift for data analytics.
Synthetic data: artificially created to closely mirror the characteristics and distribution of real data.
MIT-Gartner report (Gartner, 2022; Eastwood, 2023): 60% of data utilized in AI and analytics will be synthetically generated by 2024, and synthetic data will surpass real data in AI models by 2030.
As synthetic data gains prominence, questions arise concerning our data analytics paradigm: (1) how to utilize synthetic data; (2) its connection with raw data.
Can we benefit from synthetic data for any analytic task?

UMN Statistics 1/21

Example

Figure 1: Gao et al., 2023: Machine learning models trained on synthetic data achieve state-of-the-art performance compared with real-data-trained models for medical imaging.

Challenges for Health Care Data

Two important aspects for healthcare data and medical research:
Compliance: storage must be compliant with regulations; role-based access control.
Efficacy.
Data sharing becomes difficult due to concerns about security and privacy.
Focus on the potential impact of generative AI: Can we effectively utilize synthetic data to enhance data privacy & efficacy?

Overview

Synthetic data: produced by a generative model to replicate raw data, trained on raw data via pre-trained models with knowledge transfer from similar studies.
Benefits:
(1) privacy: privacy leakage when sharing real data.
(2) scarcity: limited size; expensive trials; time-consuming; imbalance.
Generative models: GANs (Goodfellow et al., 2014), Kar…
