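A deep reinforcement learning-based trajectory optimization method and simulation for cellular-connected UAVs
Application of Electronic Technique
Li Shengchang1, Zhang Zhenxia1, Wang Lehe1, Wu Jiahua2, Yu Guo2, Meng Xianglu2
1. Computer Application Technology Research Institute of China North Industries Group Corporation Limited; 2. Dalian University of Technology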
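Abstract: Cellular-connected unmanned aerial vehicles (UAVs) show great potential in 5G networks. To address the need to keep a stable connection to ground base stations while carrying out communication tasks, this paper studies UAV trajectory optimization. The goal is to jointly minimize task completion time and communication outage time, and to maximize communication throughput, as the UAV flies from a start point to an end point within a given area. Because the problem is non-convex, it is solved with a multi-step-learning-based Dueling Double Deep Q Network (D3QN) algorithm, which achieves adaptive trajectory optimization through interactive learning between the UAV and the ground base stations. Simulation results show that, compared with a direct-flight strategy, the method shortens task completion time by 28%, reduces communication outage duration by 42%, and raises average system throughput by 35%, a significant improvement in task efficiency, communication stability, and system throughput.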
CLC number: TN929.53  Document code: A  DOI: 10.16157/j.issn.0258-7998.257391
Chinese citation format: 李胜昌,张振峡,王乐和,等. 基于深度强化学习的蜂窝连接无人机轨迹优化方法及仿真[J]. 电子技术应用,2026,52(4):29-37.
英文引用格式: Li Shengchang,Zhang Zhenxia,Wang Lehe,et al. A deep reinforcement learning-based trajectory optimization method and simulation for cellular-connected UAVs[J]. Application of Electronic Technique,2026,52(4):29-37.
A deep reinforcement learning-based trajectory optimization method and simulation for cellular-connected UAVs
Li Shengchang1,Zhang Zhenxia1,Wang Lehe1,Wu Jiahua2,Yu Guo2,Meng Xianglu2
1.Computer Application Technology Research Institute of China North Industries Group Corporation Limited;2.Dalian University of Technology
Abstract: Cellular-connected unmanned aerial vehicles (UAVs) show great potential in 5G networks. To address the challenge of maintaining stable connections with ground base stations during communication tasks, this paper investigates a UAV trajectory optimization problem. The objective is to jointly minimize the task completion time and communication outage duration while maximizing communication throughput as the UAV travels from the starting point to the destination within a given area. Considering the non-convex nature of the problem, a multi-step learning-based Dueling Double Deep Q Network (D3QN) algorithm is adopted to achieve adaptive trajectory optimization through interactive learning between the UAV and ground base stations. Simulation results show that, compared with the direct flight strategy, this method reduces the task completion time by 28%, cuts the communication outage duration by 42%, and increases the average system throughput by 35%, achieving significant improvements in task efficiency, communication stability, and system throughput.
Key words: unmanned aerial vehicle; cellular communications; deep reinforcement learning; trajectory optimization; dueling double deep Q-network
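The abstract names a multi-step, dueling, double deep Q-network but this page does not give its equations. As a rough illustration only, the sketch below shows the textbook form of two of its ingredients: the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean of A, and an n-step learning target bootstrapped with the double-Q rule (the online network picks the next action, the target network scores it). The function names, the reward values, and the discount factor are hypothetical and are not taken from the paper, whose network architecture and reward design may differ.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Combine a state value and per-action advantages into Q-values.

    Textbook dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def n_step_double_q_target(
    rewards: Sequence[float],
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    gamma: float,
    done: bool,
) -> float:
    """Return the n-step learning target, where n = len(rewards).

    rewards        rewards collected over the n steps (illustrative values)
    q_online_next  online-network Q-values at the state reached after n steps
    q_target_next  target-network Q-values at that same state
    gamma          discount factor (assumed value, not from the paper)
    done           True if the episode ended within these n steps
    """
    # Discounted sum of the n observed rewards.
    n_step_return = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if done:
        return n_step_return  # no bootstrap past a terminal state
    # Double-Q rule: select the action with the online network ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and evaluate that action with the target network.
    return n_step_return + (gamma ** len(rewards)) * q_target_next[best_action]


if __name__ == "__main__":
    print(dueling_q_values(1.0, [0.5, -0.5, 0.0]))
    print(
        n_step_double_q_target(
            rewards=[-1.0, -1.0, -1.0],
            q_online_next=[0.2, 0.9, 0.1],
            q_target_next=[4.0, 8.0, 6.0],
            gamma=0.5,
            done=False,
        )
    )
```

In a trajectory setting, a per-step penalty such as the -1.0 used here would stand in for elapsed flight time, with further terms for outage and throughput; that mapping is an assumption made for illustration.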
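Introduction
In recent years, unmanned aerial vehicles (UAVs) have developed rapidly in many fields thanks to their high mobility and flexible deployment[1][2]. Across UAV application scenarios, ensuring flight safety and improving the timeliness of task completion are critical, and both depend on a stable, high-quality air-to-ground communication link. For this reason, the cellular-connected UAV, an important form of wireless connection between UAVs and ground base stations, has attracted wide attention[3]. Compared with conventional WiFi solutions, whose communication range is limited, cellular connectivity is not bound by short-range constraints[4] and enables truly remote control; compared with satellite communication, which offers wide coverage, cellular-connected UAVs also have a significant cost advantage. In particular, with the support of fifth-generation (5G) mobile communication technology[4], cellular-connected UAVs are expected to markedly improve the rate and reliability of the air-to-ground link. In recent years, cellular-connected UAVs have been applied successfully in search and rescue, aerial monitoring, traffic control, aerial photography, and parcel delivery.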
The full text of this article is available for download at:
http://www.ihrv.cn/resource/share/2000007034
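Author information:
Li Shengchang1, Zhang Zhenxia1, Wang Lehe1, Wu Jiahua2, Yu Guo2, Meng Xianglu2
(1. Computer Application Technology Research Institute of China North Industries Group Corporation Limited, Beijing 100083; 2. Dalian University of Technology, Dalian, Liaoning 116000)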

