
Anomaly Detection Betrayed Us, so We Gave It a New Job: Enhancing Command Line Classification with Benign Anomalous Data.pdf

Uploaded by: 竿*** ID: 981925 2025-11-29 51 pages 3.13 MB

#BHUSA BlackHatEvents

Anomaly Detection Betrayed Us, so We Gave It a New Job: Enhancing Command Line Classification with Benign Anomalous Data
Ben Gelman, Sean Bergeron

Introduction

About Me - Ben
Data Scientist at Sophos for 4 years
5 years in government-funded R&D
2 years of post-grad research at academic institutions

About Me - Sean
Deep personality estimation post-grad research
Data Scientist at Sophos for 3 years
Mechanical engineer

What Are We Talking About?

How did this happen?
Command lines
Unsustainable Manual Effort

The Perfect, Fully-Automated, Self-Updating System for Command Line Prediction, Featuring LLMs

Not Really: Anomaly Detection Betrayed Us
Malicious Precision / Benign Precision: 36% / 100%

Motivation

Unsupervised: The State of Anomaly Detection
Pros: No labels required; High scalability; Low cost
Cons: High false positive rates, extreme alert fatigue; Reliance on human expertise

The State of Anomaly Detection
Feasible? FPR 2,2,

Proportion of upper-case characters
Proportion of lower-case characters
ASCII per-character counts
Shannon entropy

Expert Features cont.
Count of echo markers
Count of replace markers
Count of # markers
Count of markers: -e, -ec, -enc, -encodedcommand, frombase64string(
Count of markers: ,set,&,&for,for%,;
Count of markers: http, www., .com, html, tcp, udp
Count of markers: lsass, samsrv, hklmsam, winlogon, netlogon, kerberos.dll, dump, .bin, ntds
Test for deliberate encoding and encryption
Check for multiple valid file paths
Check for remote executable
Check for exactly one hostname and local file path

Spark ML Features
Normalized tokens
  WordPunct tokenize: \w+|[^\w\s]+
  Replace numeric digits with *
Normalized tokens - TF-IDF
Normalized tokens - Compute most common 1024 to
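A few of the features listed above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the group names in `MARKER_GROUPS`, and the choice to count markers as case-insensitive substrings are all assumptions, since the slides do not say how markers are matched. The token pattern is the one used by NLTK's `WordPunctTokenizer`, which appears to be what the slide refers to.

```python
import math
import re
from collections import Counter

# Marker lists are copied from the slide text; the group names are hypothetical.
MARKER_GROUPS = {
    "encoding": ["-e", "-ec", "-enc", "-encodedcommand", "frombase64string("],
    "network": ["http", "www.", ".com", "html", "tcp", "udp"],
    "credential": ["lsass", "samsrv", "hklmsam", "winlogon", "netlogon",
                   "kerberos.dll", "dump", ".bin", "ntds"],
}

# Same pattern as NLTK's WordPunctTokenizer: word runs or punctuation runs.
TOKEN_RE = re.compile(r"\w+|[^\w\s]+")


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def expert_features(cmd):
    """Compute a handful of the listed features for one command line."""
    length = len(cmd) or 1  # avoid division by zero on empty input
    lowered = cmd.lower()
    feats = {
        "upper_ratio": sum(ch.isupper() for ch in cmd) / length,
        "lower_ratio": sum(ch.islower() for ch in cmd) / length,
        "entropy": shannon_entropy(cmd),
        "echo_count": lowered.count("echo"),
        "replace_count": lowered.count("replace"),
        "hash_count": cmd.count("#"),
    }
    # Assumption: markers are counted as substrings, so "-e" also
    # matches inside "-enc".
    for name, markers in MARKER_GROUPS.items():
        feats[name + "_markers"] = sum(lowered.count(m) for m in markers)
    return feats


def normalized_tokens(cmd):
    """Tokenize, then replace every numeric digit with '*'."""
    return [re.sub(r"\d", "*", tok) for tok in TOKEN_RE.findall(cmd)]
```

Replacing digits with `*` collapses values such as IP addresses, ports, and process IDs into a shared shape, which keeps the token vocabulary small before a step such as TF-IDF.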

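This document examines the challenges of using anomaly detection to enhance command line classification, and the solution the authors propose. Key points:
1. **Limitations of anomaly detection**: Anomaly detection performs poorly at command line prediction, producing high false positive rates and depending on human experts.
2. **State of the data**: Malicious data is hard to label, while labeling benign data runs into a long-tail problem.
3. **Solution**: A new approach that combines benign anomalous data with supervised learning to improve the model.
4. **Method details**: Uses command line datasets, including regular expressions and aggregated data, together with embedding models and the isolation forest algorithm.
5. **LLM labeling**: Uses large language models (LLMs) to label benign data.
6. **Evaluation**: Evaluated with time splits and manual labels.
7. **Conclusion**: Benign anomalous data augmentation is a general method for improving cybersecurity models.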
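The augmentation idea in the summary can be sketched as follows. This is a rough illustration under stated assumptions, not the pipeline from the talk: anomaly scores are taken as given (for example, from an isolation forest over command line embeddings), `label_fn` is a hypothetical stand-in for the LLM labeling step, and `top_k` is an invented parameter.

```python
def augment_with_benign_anomalies(train_set, scored_commands, label_fn, top_k=100):
    """Add the most anomalous commands that are labeled benign to a training set.

    train_set:        list of (command, label) pairs
    scored_commands:  list of (command, anomaly_score); higher means more anomalous
    label_fn:         hypothetical callable standing in for an LLM labeler;
                      returns a label string such as "benign"
    """
    # Rank candidates from most to least anomalous.
    ranked = sorted(scored_commands, key=lambda pair: pair[1], reverse=True)
    seen = {cmd for cmd, _ in train_set}
    augmented = list(train_set)
    for cmd, _score in ranked[:top_k]:
        if cmd in seen:
            continue  # already present in the training data
        if label_fn(cmd) == "benign":
            augmented.append((cmd, "benign"))
            seen.add(cmd)
    return augmented
```

Only commands labeled benign are kept, so the anomaly detector serves as a source of unusual benign examples for the supervised classifier and not as a detector of malicious activity.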