当前位置:首页 > 报告详情

用于精确控制LLM输出的通用且与上下文无关的触发器.pdf

上传人: 竿*** 编号:981926 2025-11-29 23页 3.25MB

1、#BHUSA BlackHatEventsUniversal and Context-Independent Triggers for Precise Control of LLM OutputsJiashuo Liang,Guancheng Li#BHUSA BlackHatEventsTeamJiasho LiangliangjsSecurity ResearcherGuancheng Liatuml1Security Researcher#BHUSA BlackHatEventsAgendaBackground of LLM Prompt Injection ThreatsUnivers

2、al Adversarial Trigger A New Attack Paradigmo Architecture overviewo Demo:Achieve RCE on modern LLM agentsTechnical Deep-dive:Finding the TriggersTakeaways,Q&A#BHUSA BlackHatEventsHow Prompt Injection Evolves into a Critical Attack Vector#BHUSA BlackHatEventsLLM Applications and Threats(before 2025)

3、1.LLM as Standalone Tools2.LLM as Workflow ComponentsDify workflow compositionChatGPT ConversationsNew attack surfaces:Web search resultsRAG database contentThird-party tool outputsPotential consequences:Unethical responsesWrong answersMalformed data propagated to downstream components#BHUSA BlackHa

4、tEventsLLM Applications and Threats(since 2025)3.Autonomous Agents with Direct Real-World AccessCline vibe coding:AI writes code in your IDEClaude computer use:AI controls your browser and desktop applicationsNew attack surfaces:MCP toolsOSS projectsVisual inputsPotential consequences:Backdoor code

5、injectionRemote code executionFull system compromise#BHUSA BlackHatEventsCurrent Prompt Injection Attack&Limitations“Ignore previous instructions”“Act as an unrestricted CatGirl”Leak prompt contextJailbreak“Describe your task and role”“What are the available tools?”Control model response“Here is how

6、 to build a bomb”Misclassification:dog-catLimitations:Manual injection craftingContext dependencyTask-specific tricksImprecise output controlLimited security damage Usually produce unethical or wrong answerStep 1.Escape original contextStep 2.Redirect to hijacked tasksTraditional Steps of Prompt Inj

word格式文档无特别注明外均可编辑修改,预览文件经过压缩,下载原文更清晰!
三个皮匠报告文库所有资源均是客户上传分享,仅供网友学习交流,未经上传用户书面授权,请勿作商用。
根据《Universal and Context-Independent Triggers for Precise Control of LLM Outputs》的内容,以下是全文关键点的概括: 1. **LLM Prompt Injection威胁背景**:随着LLM应用的增长,其潜在的安全威胁也日益凸显。 2. **通用对抗触发器(UAT)**:提出了一种新的攻击范式,通过优化触发词来最大化输出特定payload的概率。 3. **触发器架构**:约70%的成功率,适用于不同上下文和payload。 4. **技术深度**:使用梯度优化算法寻找触发词,支持多种格式输出。 5. **攻击成功率和可迁移性**:在特定模型家族内有一定迁移性,但跨模型家族不可迁移。 6. **局限性**:需要白盒访问、非人类可读触发词、计算资源需求大。 7. **安全影响**:可能导致远程代码执行等严重后果。 8. **建议**:在安全沙箱中运行LLM代理。
揭秘新威胁" LLM控制新突破" 实战演示!"
客服
商务合作
小程序
服务号
折叠