AI Hardware & Systems (aiandsystems)
Efficiency Challenges in Genomics
Tom Sheffler

Preface
  Goal is to give insights into the characteristics of genomics computations
  Explain AI/ML on the Edge for DNA processing
  Challenges in AI/ML for genomics (from the real world)

Genomics Applications: why does it matter?
  Cancer Screening
    identify DNA changes that increase a person's risk
    guide selection of therapies
  Whole Genome Sequencing for newborns (Wash Post 2018)
    6 days old, severe seizures
    39 hours to sequence whole genome
    simple treatment identified
  https:/

Rapidly decreasing cost, increasing data and computation
  Cost for WGS (Whole Genome Sequencing)
    $300K in 2006
    $1000 in 2020
    $100 with Ultima UG 100, Jan 2024
  https://www.genome.gov/about-genomics/fact-sheets/Sequencing-Human-Genome-cost

Introduction to Sequencing

Sequencing Workflow and Analysis
  Diagram: Extraction (ATGCTACG) -> Library Prep (Fragmented DNA, Adapter Ligation, Template, Fragment, Sequencing Library) -> Pooling -> Sequencing -> Analysis Pipeline (Primary Analysis, demux, then consensus and variant calling, *N)

Extraction
  Diagram: ATGCTACG

Library Preparation
  Diagram labels: Fragmented DNA, Adapter Ligation, Template, Fragment, Sequencing Library, "ACAC"

Pooling
  Diagram: Pool Libraries

Sequencing
  Diagram: Pool -> Sequencer -> Data
  100 GB to 1 TB+
  12 to 48 hours
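
As a rough sanity check on the sequencing figures (100 GB to 1 TB+ produced over 12 to 48 hours), the sketch below computes the average output rate those numbers would imply. It is not from the slides. It assumes decimal units (1 GB = 10^9 bytes), output spread evenly across the run, and two bounding pairings of size and duration chosen only for illustration. The function name `average_rate_mb_per_s` is hypothetical.

```python
# Back-of-the-envelope average output rate of a sequencing run.
# Run sizes and durations are taken from the slide; decimal units and the
# pairing of sizes with durations are assumptions made for illustration.

GB = 10**9
TB = 10**12
HOUR = 3600  # seconds


def average_rate_mb_per_s(total_bytes: int, hours: float) -> float:
    """Average data rate in MB/s if output were spread evenly over the run."""
    return total_bytes / (hours * HOUR) / 10**6


# Lower bound: smallest quoted run spread over the longest duration.
low = average_rate_mb_per_s(100 * GB, 48)
# Upper bound: 1 TB (the slide says "1 TB+") finished in the shortest duration.
high = average_rate_mb_per_s(1 * TB, 12)

print(f"average output rate: {low:.2f} to {high:.2f} MB/s")
# prints "average output rate: 0.58 to 23.15 MB/s"
```

Real instruments may not emit data evenly over a run, so these averages are only an order-of-magnitude guide.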

Data Pipeline
  Inherent data parallelism, potential streaming parallelism
  Diagram stages: sensor, Primary Analysis, demux, consensus, variant calling (consensus and variant calling repeated *N)
  Diagram data: BaseCalls (500GB+), Demuxed BaseCalls (500GB+), Consensus Reads, Variants (100MB+)

Primary Analysis
  Diagram labels: Primary Analysis, demux, consensus, vari