Accurately classify software requirements using prompt learning on BERT
Information Technology and Network Security, Issue 2
Luo Xianchang, Xue Yinxing
(Department of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China)
CLC number: TP183
Document code: A
DOI: 10.19358/j.issn.2096-5133.2022.02.007
Citation: Luo Xianchang, Xue Yinxing. Accurately classify software requirements using prompt learning on BERT[J]. Information Technology and Network Security, 2022, 41(2): 39-45.
Abstract: Software requirements are direct feedback from users on the utility of a software product. Classifying them accurately can greatly reduce maintenance costs and significantly speed up software development and maintenance. Traditional machine-learning classifiers (such as logistic regression, support vector machines, and K-nearest neighbors), as well as a straightforward application of BERT (Bidirectional Encoder Representations from Transformers), fail to make full use of the samples in the PROMISE software requirements dataset, which shows up as poor generalization or low classification efficiency. To strengthen BERT's semantic understanding of natural-language text, this paper applies the idea of prompt learning and transforms the K-class selection problem into a binary judgment problem. Experimental results show that, without applying any sample-balancing strategy to the imbalanced dataset, the model's classification performance far exceeds that of the two approaches above and achieves the best prediction results.
Key words: software requirement; accurate classification; bidirectional encoder representations from transformers (BERT); prompt learning
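The reformulation described in the abstract, turning one K-class decision into K yes/no judgments, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the label subset, the prompt template, and the `toy_scorer` stand-in (a word-overlap count used in place of a fine-tuned BERT sentence-pair classifier) are all assumptions made for the example.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical label subset for illustration; not the full PROMISE label list.
LABEL_DESCRIPTIONS: Dict[str, str] = {
    "F": "functional",
    "SE": "security",
    "US": "usability",
    "PE": "performance",
}


def make_prompt(label: str) -> str:
    """Build a natural-language prompt for one candidate class (assumed template)."""
    return f"This is a {LABEL_DESCRIPTIONS[label]} requirement."


def to_binary_pairs(text: str, gold: str) -> List[Tuple[str, str, int]]:
    """Expand one K-class sample into K (requirement, prompt, 0/1) samples.

    The pair built from the gold label is the only positive; the rest are negatives.
    """
    return [(text, make_prompt(label), int(label == gold))
            for label in LABEL_DESCRIPTIONS]


def predict(text: str, score_yes: Callable[[str, str], float]) -> str:
    """Return the class whose prompt receives the highest 'yes' score."""
    return max(LABEL_DESCRIPTIONS,
               key=lambda label: score_yes(text, make_prompt(label)))


def toy_scorer(text: str, prompt: str) -> float:
    """Stand-in for a binary sentence-pair model: counts shared words."""
    text_words = set(re.findall(r"[a-z]+", text.lower()))
    prompt_words = set(re.findall(r"[a-z]+", prompt.lower()))
    return float(len(text_words & prompt_words))


if __name__ == "__main__":
    requirement = "The system shall encrypt all stored passwords to ensure security."
    for pair in to_binary_pairs(requirement, gold="SE"):
        print(pair)
    print(predict(requirement, toy_scorer))
```

In a real setting, `score_yes` would be replaced by the probability that a binary sentence-pair classifier assigns to the "yes" class for the (requirement, prompt) pair; the surrounding logic stays the same.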
0 Introduction
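Software requirements are among the most direct forms of user feedback on the utility of a software product, and typically cover user experience, functional requirements, and quality issues. They are generally divided into functional and non-functional requirements: the former mainly describe the services and functional behavior of the software system, while the latter usually concern non-functional issues such as reliability, usability, security, privacy, or software permissions. With the rapid development of Internet technology, the number of desktop-client and mobile applications has grown sharply. As of November 2021, the Apple App Store alone listed more than 1.8 million applications worldwide in over 40 languages, and user reviews of these applications have grown explosively. Handling software requirements engineering at this scale has therefore become urgent. Automatic classification of software requirements can greatly reduce the workload, cost, and error of manual classification. It also enables fast and accurate analysis of the most recent, real-world user feedback, so that improvement directions can be identified efficiently, the software development and maintenance process is significantly accelerated, and user experience is greatly improved.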
The full text of this article can be downloaded at: http://www.ihrv.cn/resource/share/2000003949

