MultiFuzz: A Dense Retrieval-based Multi-Agent System for Network Protocol Fuzzing

Authors: Youssef Maklad, Fares Wael, Ali Hamdi, Wael Elsersy, Khaled Shaban

Categories: cs.CR, cs.CL, cs.MA, cs.NI

Published: 2025-08-19

💡 One-Sentence Summary

Proposes MultiFuzz to address the limited effectiveness of traditional protocol fuzzing.

🎯 Matched area: Pillar 9: Embodied Foundation Models

Keywords: protocol fuzzing, multi-agent systems, dense retrieval, semantic understanding, automated testing, network security

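📋 Key Points

Existing fuzzing methods have limited semantic understanding of complex protocols and rigid mutation strategies, which leads to poor testing results.
MultiFuzz integrates semantic-aware context retrieval with multi-agent collaboration to adapt its fuzzing strategy dynamically.
Experiments show that MultiFuzz significantly outperforms state-of-the-art fuzzers in branch coverage and in the depth of state exploration.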

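📝 Abstract (translated from Chinese)

Traditional protocol fuzzing techniques, such as AFL-based systems, often lack effectiveness because of their limited semantic understanding of complex protocol grammars and their rigid seed mutation strategies. Recent work such as ChatAFL has integrated large language models (LLMs) into protocol fuzzing to address these limitations, but it still suffers from unreliable output, LLM hallucinations, and the assumption that the LLM already knows the protocol specification. This paper proposes MultiFuzz, a novel dense retrieval-based multi-agent system that aims to overcome these limitations by integrating semantic-aware context retrieval, specialized agents, and structured tool-assisted reasoning. MultiFuzz builds embeddings from agentic chunks of protocol documentation (RFC documents) in a vector database to support a retrieval-augmented generation (RAG) pipeline. This lets the agents produce more reliable and structured outputs and improves the fuzzer's state coverage and adherence to syntactic constraints when mutating protocol messages. Experimental evaluation shows that MultiFuzz significantly outperforms existing fuzzers such as NSFuzz, AFLNet, and ChatAFL in branch coverage and in the depth of exploration of protocol states and transitions.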

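🔬 Method Details

Problem definition: The paper targets the shortcomings of traditional protocol fuzzing in understanding complex protocol grammars and in its mutation strategies, which lead to unsatisfactory fuzzing results. Existing approaches such as AFL and ChatAFL are limited in semantic understanding and output reliability.

Core idea: MultiFuzz introduces dense retrieval and a multi-agent system, and uses semantic-aware contextual information to guide the fuzzing process, which improves the reliability and structure of the generated outputs.

Technical framework: The overall architecture consists of several modules. First, agentic chunks are extracted from RFC documents and used to build a vector database. Second, through a retrieval-augmented generation (RAG) pipeline, the agents generate protocol messages that better respect syntactic constraints. Finally, the agents collaborate through chain-of-thought reasoning to adapt the fuzzing strategy dynamically.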
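The chunk, embed, and retrieve flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `chunk_text`, `embed`, `VectorStore`, and the `RFC_NOTES` text are hypothetical names and paraphrased notes, not MultiFuzz's code or quotations from an RFC. Paragraph splitting stands in for agentic chunking, and plain term-count vectors stand in for a learned dense encoder; a real pipeline would call an embedding model and a vector database instead.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split text into paragraph-based chunks of at most max_words words.

    Simplification: the paper uses agentic chunking; this is a fixed rule.
    """
    chunks = []
    for para in re.split(r"\n\s*\n", text.strip()):
        words = para.split()
        for i in range(0, len(words), max_words):
            chunks.append(" ".join(words[i:i + max_words]))
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for a dense encoder: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """In-memory list of (chunk, vector) pairs with brute-force search."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        self.items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


# Hypothetical, paraphrased notes about RTSP methods (not RFC quotations).
RFC_NOTES = """
SETUP request negotiates the transport mechanism for a media stream and returns a session identifier.

PLAY request tells the server to start sending data for an established session.

TEARDOWN request stops stream delivery and frees the resources associated with the session.
"""

store = VectorStore()
for chunk in chunk_text(RFC_NOTES):
    store.add(chunk)

# The retrieved chunks would be placed in an agent's prompt as context.
context = store.search("Which request starts sending data?", k=1)
print(context[0])
```

Brute-force search keeps the sketch short; at the scale of full RFC documents an approximate nearest-neighbor index would be the usual choice.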

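Key innovation: MultiFuzz combines dense retrieval, agentic coordination, and language model reasoning into a new paradigm for autonomous protocol fuzzing, which improves both the effectiveness and the flexibility of fuzzing compared with existing methods.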

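Key design: MultiFuzz stores embeddings of the protocol documentation in a vector database and provides a collaboration mechanism among the agents to improve fuzzing results. This summary also mentions a dedicated loss function for optimizing generated protocol messages, but the original abstract does not describe one.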
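To make "adherence to syntactic constraints" concrete, here is a minimal sketch of a structure-preserving mutation for an RTSP request. It is not MultiFuzz's mutator: the function names, the set of protected headers, the mutation list, and the seed URL are all hypothetical. The only idea it shows is that the request line, the `CSeq` header, and the CRLF framing stay intact while one other header value is altered.

```python
import random

CRLF = "\r\n"


def parse_request(raw: str):
    """Parse the head of an RTSP request into (method, uri, version, headers)."""
    head = raw.split(CRLF + CRLF, 1)[0]
    lines = head.split(CRLF)
    method, uri, version = lines[0].split(" ", 2)
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return method, uri, version, headers


def serialize_request(method: str, uri: str, version: str, headers) -> str:
    """Rebuild the request with CRLF line endings and a terminating blank line."""
    lines = [f"{method} {uri} {version}"] + [f"{n}: {v}" for n, v in headers]
    return CRLF.join(lines) + CRLF + CRLF


def mutate_request(raw: str, rng: random.Random, protected=("CSeq",)) -> str:
    """Mutate one unprotected header value and keep the message framing valid."""
    method, uri, version, headers = parse_request(raw)
    candidates = [i for i, (name, _) in enumerate(headers) if name not in protected]
    if candidates:
        i = rng.choice(candidates)
        name, value = headers[i]
        # Illustrative value-level mutations: duplicate, reverse, pad, empty.
        mutations = [value * 2, value[::-1], value + "A" * 64, ""]
        headers[i] = (name, rng.choice(mutations))
    return serialize_request(method, uri, version, headers)


# Hypothetical seed message; the URL and port numbers are placeholders.
SEED = (
    "SETUP rtsp://127.0.0.1:8554/stream/track1 RTSP/1.0" + CRLF
    + "CSeq: 2" + CRLF
    + "Transport: RTP/AVP;unicast;client_port=8000-8001" + CRLF
    + CRLF
)

mutated = mutate_request(SEED, random.Random(0))
print(repr(mutated))
```

In an LLM-guided fuzzer, the choice of which fields are mutable would come from the model's grammar output rather than from a hard-coded `protected` tuple as assumed here.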

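📊 Experimental Highlights

In tests on the Real-Time Streaming Protocol (RTSP), MultiFuzz achieves significantly higher branch coverage and explores protocol states and transitions more deeply than state-of-the-art fuzzers such as NSFuzz, AFLNet, and ChatAFL, which indicates its effectiveness for protocol fuzzing.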
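As a rough illustration of what counting "states and transitions" can mean, the sketch below approximates protocol states by server response status codes, a common convention in stateful network fuzzers. This is an assumption: the summary does not say how MultiFuzz measures state coverage, and the function name, the `INIT` sentinel, and the traces are hypothetical.

```python
def state_coverage(traces):
    """Count distinct states and transitions across response-code traces.

    Each trace is the list of status codes a server returned for one
    message sequence. "INIT" marks the state before any response.
    """
    states = set()
    transitions = set()
    for trace in traces:
        previous = "INIT"
        for code in trace:
            states.add(code)
            transitions.add((previous, code))
            previous = code
    return len(states), len(transitions)


# Hypothetical traces: 200 OK, 454 Session Not Found,
# 455 Method Not Valid in This State.
traces = [
    [200, 200, 200],
    [200, 454],
    [200, 200, 455],
]
num_states, num_transitions = state_coverage(traces)
print(num_states, num_transitions)
```

Comparing these two counts across fuzzers over the same time budget gives a coarse view of how deeply each one explores the protocol state machine; branch coverage would be measured separately through instrumentation.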

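🎯 Application Scenarios

MultiFuzz has broad potential in network security, particularly for protocol fuzzing, vulnerability discovery, and security assessment of network protocols. Its scalable, modular design offers a solid foundation for future research on intelligent agent-based fuzzing systems.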

📄 Abstract (Original)

Traditional protocol fuzzing techniques, such as those employed by AFL-based systems, often lack effectiveness due to a limited semantic understanding of complex protocol grammars and rigid seed mutation strategies. Recent works, such as ChatAFL, have integrated Large Language Models (LLMs) to guide protocol fuzzing and address these limitations, pushing protocol fuzzers to wider exploration of the protocol state space. But ChatAFL still faces issues like unreliable output, LLM hallucinations, and assumptions of LLM knowledge about protocol specifications. This paper introduces MultiFuzz, a novel dense retrieval-based multi-agent system designed to overcome these limitations by integrating semantic-aware context retrieval, specialized agents, and structured tool-assisted reasoning. MultiFuzz utilizes agentic chunks of protocol documentation (RFC Documents) to build embeddings in a vector database for a retrieval-augmented generation (RAG) pipeline, enabling agents to generate more reliable and structured outputs, enhancing the fuzzer in mutating protocol messages with enhanced state coverage and adherence to syntactic constraints. The framework decomposes the fuzzing process into modular groups of agents that collaborate through chain-of-thought reasoning to dynamically adapt fuzzing strategies based on the retrieved contextual knowledge. Experimental evaluations on the Real-Time Streaming Protocol (RTSP) demonstrate that MultiFuzz significantly improves branch coverage and explores deeper protocol states and transitions over state-of-the-art (SOTA) fuzzers such as NSFuzz, AFLNet, and ChatAFL. By combining dense retrieval, agentic coordination, and language model reasoning, MultiFuzz establishes a new paradigm in autonomous protocol fuzzing, offering a scalable and extensible foundation for future research in intelligent agentic-based fuzzing systems.
