深圳国家基因库 China National GeneBank

DIF and DIX work in Lustre with LSI Fusion-MPT
Homer Li, 2023-10

Why DIF and DIX
Show the silent data corruption
  Medium errors
  SSD NAND / magnetic media / DRAM errors
  OS driver bugs
  Hardware firmware issues
  Transmission and receiving errors
Shortcomings
  Performance overhead
  The implementation and configuration of DIX are relatively complex
  DIX can only detect and correct some data errors

DIF/DIX with SCSI device (mpt3sas 44.00.00.00)
DIX
  Application: Lustre
  OS: DIX/DIF is fully supported in RHEL 7.6 and later
DIF (T10 PI)
  HBA driver
  SCSI device support
Linux block driver
  DM and MD linear
  RAID0/1/10
ZFS supports end-to-end data protection by design
  Lustre crc + zfs checksum
  Or add a bee watcher to test zfs
[Diagram labels: Application, OS, HBA, IO expanders, switch, Disk drive; DIF; Driver, Filesystem, block; DIX; Lustre, Linux, ZFS cksum, Lustre cksum]
Bit flipping in flight without a checksum (silent error)
Bit flipping in the medium (silent error)
  RAID/EC does not help much
  DIF/DIX cannot replace full-stack protection

CONFIG with SCSI device
Firmware support
  Some pages may be wrong; confirm with the vendor
  sg_vpd --page=ei --long /dev/sdap
  SPT=1: protection types 1 and 2 supported
  Some firmware does not fully support this; for more information, contact the vendor of the SCSI device
mpt3sas kernel module, LSI Fusion-MPT 9300/9500
  insmod mpt3sas.ko prot_mask=0x7f
sg_format --format --size=4096 --fmtpinfo=2
  Provides PI protection using 10- and 16-byte commands; does not allow the use of 32-byte commands
sg_format --format --size=4096 --fmtpinfo=3 (recommended in production environments)
  Provides checking control and additional expected fields within the 32-byte CDBs
RHEL 7.6 and later fully support DIF/DIX
AlmaLinux 8.8 x86_64
Lustre 2.12 and later/L