
Fuzzing test method for industrial control protocol based on deep feature and reinforcement learning

Cyber Security and Data Governance

Zong Xuejun1,2, Sun Junhui1,2, He Kan1,2, Shi Hongyan1,2, Lian Lian1,2, Ning Bowei2,3

1. College of Information Engineering, Shenyang University of Chemical Technology; 2. Liaoning Key Laboratory of Information Security for Petrochemical Industry; 3. College of Artificial Intelligence, Shenyang University of Technology

Abstract: To address the insufficient understanding of protocol semantics and the reliance on a single mutation strategy in vulnerability mining for industrial control protocols, this paper proposes CTARFuzz, a fuzzing method for industrial control protocols that fuses deep features with reinforcement learning. The method uses the CTCA-Net model to extract protocol structure and context features and introduces an attention mechanism to emphasize key fields, improving the diversity and acceptance rate of test cases. It is combined with an Actor-Critic reinforcement learning model: the output features of CTCA-Net drive the Actor network to select a mutation strategy and generate test cases, while the Critic network dynamically optimizes the strategy based on device feedback, so that the mutation strategy is optimized adaptively. Experiments were carried out with the Modbus TCP, EtherNet/IP and S7Comm protocols on an attack-and-defense cyber range that models a typical industrial scenario of an energy enterprise. The results show that CTARFuzz achieves a higher anomaly trigger rate than the other methods, with a high acceptance rate and diversity, and that it triggers anomalies in multiple devices in the range, which verifies its applicability and effectiveness.

Keywords: fuzzing; industrial control protocol; convolutional neural network; reinforcement learning; temporal convolutional network

CLC number: TP393  Document code: A  DOI: 10.19358/j.issn.2097-1788.2026.02.001
Citation: Zong Xuejun, Sun Junhui, He Kan, et al. Fuzzing test method for industrial control protocol based on deep feature and reinforcement learning [J]. Cyber Security and Data Governance, 2026, 45(2): 1-11.

Introduction

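Industrial control systems (ICS) are essential to modern industrial automation and are widely used in manufacturing, power systems and other fields [1]. An ICS typically consists of industrial control devices such as programmable logic controllers (PLC), distributed control systems (DCS) and remote terminal units (RTU) [2], which communicate with and control one another through industrial control protocols (ICP). With the development of the industrial Internet, ICS are gradually shifting toward open network architectures [3]. This improves interconnection and interoperability, but it also exposes the systems to network attacks. For example, in May 2025 Pakistan launched a large-scale cyberattack against India in which the industrial control systems of India's national power grid were attacked and about 70% of the grid was paralyzed [4].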

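Many ICPs were designed without adequate consideration of network security, and their inherent fragility has made vulnerability mining a research focus [5]. Fuzzing can discover unknown vulnerabilities by mutating protocol messages and observing how devices respond [6]. However, when applied to ICPs, traditional fuzzing suffers from insufficient test-case diversity and a low acceptance rate [7].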
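To make "mutating protocol messages" concrete, the following is a minimal sketch, not the CTARFuzz implementation. It builds a Modbus TCP "Read Holding Registers" request and applies two simple mutation operators that leave the MBAP header intact, which is one way to keep a mutated frame likely to be accepted. The function names, the choice of operators and the boundary values are assumptions made for illustration.

```python
import random
import struct


def build_read_holding_registers(tid: int, unit: int, addr: int, qty: int) -> bytes:
    """Build a Modbus TCP 'Read Holding Registers' (function 0x03) request."""
    pdu = struct.pack(">BHH", 0x03, addr, qty)
    # MBAP header: transaction id, protocol id (0), length (unit id + PDU), unit id
    mbap = struct.pack(">HHHB", tid, 0, len(pdu) + 1, unit)
    return mbap + pdu


def mutate_quantity(frame: bytes, rng: random.Random) -> bytes:
    """Replace the 2-byte quantity field (offset 10) with a boundary value.

    The frame length does not change, so the MBAP length field stays correct.
    """
    boundary = rng.choice([0x0000, 0x0001, 0x007D, 0x007E, 0xFFFF])
    return frame[:10] + struct.pack(">H", boundary)


def flip_bit(frame: bytes, rng: random.Random, start: int = 7) -> bytes:
    """Flip one random bit in the PDU (offset `start` onward), keeping the header."""
    data = bytearray(frame)
    pos = rng.randrange(start, len(data))
    data[pos] ^= 1 << rng.randrange(8)
    return bytes(data)


if __name__ == "__main__":
    rng = random.Random(0)
    seed = build_read_holding_registers(tid=1, unit=1, addr=0, qty=10)
    print(seed.hex())
    print(mutate_quantity(seed, rng).hex())
    print(flip_bit(seed, rng).hex())
```

A real fuzzer would send each mutated frame to the device under test and record whether it was accepted, rejected, or caused abnormal behaviour. That feedback loop is omitted here.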

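In recent years, deep learning [8] and reinforcement learning [9] have shown strong potential in vulnerability mining. Cheng et al. [10] proposed MSFuzz, which uses large language models (LLM) to understand protocol syntax and generate test cases that conform to the protocol specification, but its model training relies on limited protocol samples. Yang et al. [11] proposed WG-GFuzz, which uses a generative adversarial network (GAN) to generate test cases, but it depends heavily on the format and state features of specific protocols and lacks generality. Che et al. [12] proposed an information-theory-based fuzzing method that generates test cases through a protocol structure parsing algorithm and a genetic algorithm, but it depends to some extent on the quality and quantity of training data. Wanyan et al. [13] proposed a mutation method based on protocol features that combines mutation of non-key fields with test-case combination to reduce redundant inputs, but its acceptance rate is insufficient.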

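Feature extraction for industrial control protocols currently has several shortcomings, and a single deep learning model cannot extract features accurately. Convolutional neural networks (CNN) [14] can capture local combination patterns of protocol fields but cannot model long-range temporal dependencies. Temporal convolutional networks (TCN) [15] can cover long sequences through causal and dilated convolutions but pay too little attention to key semantic fields. This paper therefore extracts features with the CTCA-Net model. To overcome the limitations of a single model, CTCA-Net adopts a fused design: the CNN captures the local structural features of protocol messages, the TCN establishes long-range temporal dependencies between fields, and an attention mechanism then emphasizes the key semantic fields, which together improve feature extraction performance.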
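The data flow described above (local convolution, then causal dilated convolution, then attention) can be illustrated with a small pure-Python sketch. It is an assumption-laden toy and not the CTCA-Net architecture, whose layers and parameters are not given in this excerpt. The kernels are fixed rather than learned, there is a single channel, and the attention scores are simply the feature values themselves.

```python
import math


def conv1d_same(x, kernel):
    """CNN-style local convolution with zero padding (odd kernel length assumed)."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k)) for i in range(len(x))]


def causal_dilated_conv(x, kernel, dilation):
    """TCN-style causal convolution: out[t] uses only x[t], x[t-d], x[t-2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - j * dilation
            if idx >= 0:  # positions before the start are treated as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def attention_pool(features):
    """Softmax attention over positions, using each feature as its own score."""
    m = max(features)
    exps = [math.exp(f - m) for f in features]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = sum(w * f for w, f in zip(weights, features))
    return weights, pooled


def ctca_sketch(message: bytes):
    """Hypothetical pipeline: bytes -> local features -> long-range features -> attention."""
    x = [b / 255.0 for b in message]
    local = conv1d_same(x, [0.25, 0.5, 0.25])
    h = causal_dilated_conv(local, [0.5, 0.5], dilation=1)
    h = causal_dilated_conv(h, [0.5, 0.5], dilation=2)  # larger dilation widens the receptive field
    return attention_pool(h)


if __name__ == "__main__":
    weights, pooled = ctca_sketch(bytes([0, 1, 0, 0, 0, 6, 1, 3, 0, 0, 0, 10]))
    print([round(w, 3) for w in weights], round(pooled, 4))
```

Stacking causal convolutions with increasing dilation is what lets a TCN relate distant fields without a very deep network. In a trained model the attention weights would indicate which byte positions the model treats as key fields.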

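In summary, this paper proposes a fuzzing method for industrial control protocols that fuses deep features with reinforcement learning. The main contributions are as follows:

(1) The CTCA-Net model is proposed to extract deep protocol features, addressing the insufficient understanding of protocol semantics in traditional methods and improving the acceptance rate and diversity of test cases.

(2) An Actor-Critic reinforcement learning framework is designed to optimize the mutation strategy autonomously, addressing the single-strategy limitation of traditional mutation and improving testing efficiency.

(3) The performance of CTARFuzz is evaluated with the Modbus TCP, EtherNet/IP and S7Comm protocols. Compared with existing fuzzing methods, CTARFuzz achieves a higher anomaly trigger rate, which verifies its adaptability and practicality across different protocols and devices.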
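As an illustration of how an Actor-Critic loop can adapt the choice of mutation strategy from device feedback, the sketch below uses a deliberately simplified single-state setting. This is an assumption: in CTARFuzz the Actor is driven by CTCA-Net features, which are omitted here. The strategy names, the simulated reward probabilities and the learning rates are all hypothetical, and the "device" is replaced by a random reward.

```python
import math
import random

STRATEGIES = ["bit_flip", "boundary_value", "field_swap"]  # hypothetical names


def softmax(prefs):
    """Turn actor preferences into a probability distribution over strategies."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(reward_prob, steps=2000, actor_lr=0.1, critic_lr=0.1, seed=0):
    """Single-state actor-critic: the critic is a scalar estimate of expected reward."""
    rng = random.Random(seed)
    prefs = [0.0] * len(reward_prob)  # actor parameters
    value = 0.0                       # critic parameter
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(len(probs)), weights=probs)[0]
        # Simulated feedback: 1.0 if the mutated case "triggered an anomaly"
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        advantage = reward - value            # how much better than expected
        value += critic_lr * advantage        # critic update
        for a in range(len(prefs)):           # actor update (policy gradient)
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += actor_lr * advantage * grad
    return softmax(prefs), value


if __name__ == "__main__":
    probs, value = train([0.05, 0.6, 0.2])
    for name, p in zip(STRATEGIES, probs):
        print(f"{name}: {p:.3f}")
    print(f"critic value: {value:.3f}")
```

The critic's estimate serves as a baseline, so a strategy is reinforced only when its feedback is better than what the policy already expects. A state-dependent version would replace the preference list and the scalar value with networks that take message features as input.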

The full text of this article can be downloaded from:

http://www.ihrv.cn/resource/share/2000006983

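Author information:

Zong Xuejun1,2, Sun Junhui1,2, He Kan1,2, Shi Hongyan1,2, Lian Lian1,2, Ning Bowei2,3

(1. College of Information Engineering, Shenyang University of Chemical Technology, Shenyang, Liaoning 110142;

2. Liaoning Key Laboratory of Information Security for Petrochemical Industry, Shenyang, Liaoning 110142;

3. College of Artificial Intelligence, Shenyang University of Technology, Shenyang, Liaoning 110870)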