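Automatic Music Composition Based on Cooperative Generation with Multiple Transformer Networks

Information Technology and Network Security, Issue 5

Wang Songchao, Li Jinlong

(School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230026)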

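Abstract: Automatic composition algorithms for multi-track music must balance the coherence of each individual sequence with the harmony among multiple sequences. Previous work has typically either merged the sequences into one or run multiple generators in parallel; neither approach can fully capture the dependencies between notes while also preserving the continuity of each individual sequence. This paper proposes the MuseTransformer framework, which includes a generator pool module composed of multiple Transformers, together with an asynchronous execution strategy and a synchronization mechanism for the generators that ensure fine-grained dependencies are captured. For the sequence representation of scores, a Key Position Symbol (KPS) is proposed to improve representation efficiency. Experimental results on several music-domain evaluation metrics show that the multi-track sequences generated by the proposed model match or outperform other state-of-the-art methods in harmony, coherence, and the space efficiency of the sequence representation.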

Keywords: music generation; sequence representation; sequence model

CLC number: TP37
Document code: A
DOI: 10.19358/j.issn.2096-5133.2022.05.008
Citation: Wang Songchao, Li Jinlong. Automatic music composition based on multi-Transformer cooperation[J]. Information Technology and Network Security, 2022, 41(5): 51-58.

Automatic music composition based on multi-Transformer cooperation

Wang Songchao, Li Jinlong

(School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China)

Abstract: Multi-track music generation algorithms need to account for both the coherence of each single track and the strong dependencies among multiple tracks. Previous methods either merge multiple sequences into one long sequence or run multiple generators in parallel; the former loses the continuity of each single track, or the latter fails to capture the complete dependencies among tokens, so neither achieves both goals. In this paper, we propose MuseTransformer, which contains multiple Transformer generators, one for each track. To capture dependencies among tracks in a fine-grained manner, we design an asynchronous execution strategy that enables cooperation and synchronization among all generators. For music sequence representation, we design the Key Position Symbol (KPS) to improve representation efficiency. Experiments on multiple music-domain metrics show the advantages of our model in multi-track harmony, coherence, and spatial compactness compared with other state-of-the-art methods.

Keywords: music generation; sequence representation; sequence model

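0 Introduction

Multi-target sequence generation has important applications in tasks such as multi-track music generation, which requires both the continuity of each generated sequence and strong correlation among the sequences. This paper focuses on the multi-sequence generation problem in the context of music generation. Modern songs usually contain multiple tracks, including a melody track and several instrument tracks for accompaniment. Early studies [1-2] focused on melody generation with only a single track, whereas recent work [3-4] has begun to explore multi-track music generation. This paper considers only multi-track music generation using sequence-based methods.

Sequence-based methods first serialize the score into one or more symbol sequences and feed them to a sequence model. Typically, a sequence format similar to the MIDI protocol is designed to represent a single-track music sequence [1-2, 5]. Compared with single-track generation, the multi-track generation task requires the generated tracks to be strongly correlated while each retains its own continuity.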
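As a rough illustration of the serialization step described above, the following Python sketch turns one track into a MIDI-like event sequence. The token names (`NOTE_ON_*`, `NOTE_OFF_*`, `TIME_SHIFT_*`), the `Note` type, and the `serialize_track` function are assumptions made for illustration only; they are not the representation used in this paper, and the KPS format is not shown here.

```python
from typing import List, NamedTuple


class Note(NamedTuple):
    """Hypothetical note record used only for this illustration."""
    pitch: int     # MIDI pitch number (0-127)
    start: int     # onset time in ticks
    duration: int  # length in ticks


def serialize_track(notes: List[Note]) -> List[str]:
    """Serialize one track into a MIDI-like token sequence (illustrative format)."""
    events = []
    for note in notes:
        # Priority 0 sorts before 1, so a note-off precedes a note-on
        # that falls on the same tick.
        events.append((note.start, 1, f"NOTE_ON_{note.pitch}"))
        events.append((note.start + note.duration, 0, f"NOTE_OFF_{note.pitch}"))
    events.sort()

    tokens: List[str] = []
    now = 0
    for time, _, token in events:
        if time > now:
            # Advance the clock with an explicit time-shift token.
            tokens.append(f"TIME_SHIFT_{time - now}")
            now = time
        tokens.append(token)
    return tokens


if __name__ == "__main__":
    melody = [Note(60, 0, 4), Note(64, 0, 4), Note(67, 4, 2)]
    print(serialize_track(melody))
```

A multi-track piece could be handled either by merging such events from all tracks into one long sequence or by keeping one sequence per track; these correspond to the two families of prior approaches mentioned in the abstract.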

The full text of this paper is available for download at: http://www.ihrv.cn/resource/share/2000004247

