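Lightweight object detection algorithm based on attention feature pyramid network

Application of Electronic Technique, 2021, Issue 10

Zhao Yifei, Wang Yong

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China

Abstract: Deep-learning-based object detection algorithms are difficult to deploy on low-compute platforms such as mobile devices because of their model complexity and their demand for computing power. To reduce model size, a lightweight object detection algorithm is proposed. Building on top-down feature fusion, the algorithm adds an attention mechanism to construct a feature pyramid network and thereby obtain a finer-grained feature representation. The model takes images with a resolution of 320×320 as input, requires only 0.72 B floating-point operations, and reaches 74.2% mAP on the VOC dataset, an accuracy similar to that of traditional one-stage object detection algorithms. The experimental data show that the algorithm significantly reduces model computation while preserving detection accuracy, which makes it better suited to object detection under low-compute conditions.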

Keywords: object detection; feature pyramid; attention mechanism; lightweight algorithm

CLC number: TN98; TP391
Document code: A
DOI: 10.16157/j.issn.0258-7998.211320
Citation format (Chinese): 赵义飞，王勇. 基于注意力特征金字塔的轻量级目标检测算法[J]. 电子技术应用，2021，47(10)：33-37.
Citation format (English): Zhao Yifei, Wang Yong. Lightweight object detection algorithm based on attention feature pyramid network[J]. Application of Electronic Technique, 2021, 47(10): 33-37.

Lightweight object detection algorithm based on attention feature pyramid network

Zhao Yifei, Wang Yong

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China

Abstract: Object detection algorithms based on deep learning are difficult to deploy on low-compute platforms such as mobile devices because of their model complexity and computational demands. To reduce the scale of the model, this paper proposes a lightweight object detection algorithm. Building on top-down feature fusion, the algorithm constructs a feature pyramid network with an added attention mechanism to obtain a finer-grained feature representation. The proposed model takes a 320×320 image as input, requires only 0.72 B FLOPs, and achieves 74.2% mAP on the VOC dataset, an accuracy similar to that of traditional one-stage object detection algorithms. The experimental data show that the algorithm significantly reduces the computational cost of the model while maintaining accuracy, making it better suited to object detection on platforms with low computing power.

Key words: object detection; feature pyramid; attention mechanism; lightweight algorithm
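
As a rough illustration of the idea summarized in the abstract, the sketch below fuses a three-level feature pyramid from the coarsest level downward and applies a channel gate after each fusion step. The form of the attention (a parameter-free sigmoid gate on each channel's global mean, in the spirit of squeeze-and-excitation), the 40/20/10 feature-map sizes, the channel count, and all function names are assumptions made for illustration. This excerpt does not specify the paper's actual attention module or network layout.

```python
import numpy as np

def channel_attention(feat):
    """Reweight the channels of a (C, H, W) map with a sigmoid gate.

    Assumed, parameter-free stand-in for a learned attention module.
    """
    squeeze = feat.mean(axis=(1, 2))        # (C,) global average pooling
    gate = 1.0 / (1.0 + np.exp(-squeeze))   # (C,) weights in (0, 1)
    return feat * gate[:, None, None]       # broadcast the gate over H and W

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def top_down_fuse(features):
    """Fuse maps ordered fine -> coarse, each half the size of the previous one."""
    fused = [channel_attention(features[-1])]       # start at the coarsest level
    for lateral in reversed(features[:-1]):
        merged = lateral + upsample2x(fused[0])     # top-down fusion by addition
        fused.insert(0, channel_attention(merged))  # attention after each fusion
    return fused

# Hypothetical pyramid for a 320x320 input at strides 8, 16 and 32.
rng = np.random.default_rng(0)
pyramid = [rng.standard_normal((8, size, size)) for size in (40, 20, 10)]
outputs = top_down_fuse(pyramid)
print([o.shape for o in outputs])
```

Each output keeps the resolution of its input level, so detection heads could be attached per level. A trained model would replace the fixed gate with learned weights.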

0 Introduction

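Object detection is one of the key components of computer vision. It aims to model the human visual cognitive process within a unified framework and to accomplish vision tasks in specific application scenarios such as pedestrian detection, face recognition, and text detection. In 2012, Krizhevsky et al.^[1] proposed AlexNet, which applied convolutional neural networks to image classification with remarkable results. Since then, deep-learning-based convolutional neural network algorithms have been replacing traditional algorithms built on hand-crafted features and have become the mainstream research direction in computer vision.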

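Current deep-learning-based object detection algorithms fall into two classes: one-stage and two-stage detectors. One-stage detectors, represented by SSD^[2] and the Yolo^[3-5] series, place anchors on the feature maps extracted by a convolutional neural network and run detection on bounding boxes of preset sizes and aspect ratios at each anchor. Two-stage detectors, represented by the RCNN^[6-8] series, first use an extra step to generate candidate regions on the feature map and then run detection on those regions. Compared with one-stage algorithms, two-stage algorithms generally achieve higher detection accuracy, but the extra computation makes them comparatively slower.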
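
To make the anchor mechanism of one-stage detectors concrete, the following minimal sketch enumerates the preset boxes on a single feature map. The function name, the 10×10 map size, and the scale and aspect-ratio values are hypothetical and are not taken from SSD, Yolo, or this paper. Only the 320-pixel input size comes from the abstract.

```python
from itertools import product
from math import sqrt

def generate_anchors(feature_size, image_size, scales, aspect_ratios):
    """Return anchor boxes (cx, cy, w, h) in pixels for one square feature map."""
    stride = image_size / feature_size  # image pixels covered by one cell
    anchors = []
    for row, col in product(range(feature_size), repeat=2):
        # Anchor point: the centre of the cell, mapped back to image coordinates.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, aspect_ratios):
            # ratio = width / height; the box area stays scale**2 for every ratio.
            w = scale * sqrt(ratio)
            h = scale / sqrt(ratio)
            anchors.append((cx, cy, w, h))
    return anchors

# Illustrative values only: a 10x10 map on a 320x320 image.
anchors = generate_anchors(
    feature_size=10,
    image_size=320,
    scales=[64, 128],
    aspect_ratios=[0.5, 1.0, 2.0],
)
print(len(anchors))  # 10 * 10 cells * 2 scales * 3 ratios = 600
print(anchors[1])    # (16.0, 16.0, 64.0, 64.0)
```

A one-stage detector predicts a class score and box offsets for every such anchor in a single pass, whereas a two-stage detector would first narrow the set down to candidate regions.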

The full text of this article can be downloaded from: http://www.ihrv.cn/resource/share/2000003778

Original content statement: This content is original to the AET website; reproduction without authorization is prohibited.
