Design and implementation of an elastic self-organizing multi-cluster management system
Cyber Security and Data Governance (网络安全与数据治理)
Xia Lingming, Zhou Jun, Zhao Feng
Future Network Research Center, Network Communication and Security Purple Mountain Laboratory, Nanjing 211111, China
Abstract: Cloud-native technologies such as Kubernetes, as applied in industry, have limited carrying capacity, cannot meet higher availability requirements, and are prone to cloud-vendor lock-in. Strategies such as Eastern Data and Western Computing depend on multi-cluster management technology, yet traditional cloud management platforms struggle to deploy and govern services for applications that span multiple clouds. To address these problems, this paper proposes software-defined self-organizing infrastructure management and idempotent hierarchical scheduling, and designs and implements an elastic infrastructure management architecture that takes the cluster as its smallest unit. The architecture can organize multiple Kubernetes clusters into arbitrary topologies, such as centralized, decentralized, and tree structures, and schedule and manage applications across clouds. The scheme was tested on a tree-shaped cluster structure and compared with other solutions. The results show that it can meet the need to organize and manage massive numbers of clusters in future distributed cloud scenarios, while keeping the time to register a new cluster within 1 s and application scheduling latency within 200 ms.
Key words: self-organizing infrastructure; distributed cloud; idempotent hierarchical scheduling
CLC number: TP393 Document code: A DOI: 10.19358/j.issn.2097-1788.2023.12.014
Citation: Xia Lingming, Zhou Jun, Zhao Feng. Design and implementation of an elastic self-organizing multi-cluster management system[J]. Cyber Security and Data Governance, 2023, 42(12): 84-89.

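Introduction

A single Kubernetes[1] cluster cannot meet edge, regional, and resource-management requirements. Typical multi-cluster scenarios such as Eastern Data and Western Computing[2] therefore have to solve cluster access control, cluster resource abstraction, permission management, application management, multi-cluster scheduling, service continuity, multi-tenancy, and multi-cluster service discovery[3-5], which greatly increases the complexity and difficulty of multi-cluster solutions. In the community and in industry today, cluster topologies are predominantly a two-layer parent-child architecture: the parent cluster acts as the control cluster, and the remaining clusters are child clusters that carry workloads. The four mainstream solutions are the Kubefed[6-7] federation scheme, Karmada[8], Clusternet[9], and Admiralty[10]. Kubefed and Karmada belong to one category: they use Template, Override, and Propagation to define a workload's common configuration, cluster-specific configuration, and scheduling policy, respectively. Karmada evolved from Kubernetes Federation (Kubefed) but supports richer pluggable scheduling capabilities and features such as multi-cluster service (Multi cluster service), and it has become a CNCF incubating project. However, both support only a centralized two-layer architecture, so their scalability and carrying capacity have theoretical bottlenecks. Clusternet is a multi-cluster solution that follows the OCM model and has been accepted as a CNCF sandbox project; a child cluster joins the parent cluster at startup using a controlled token.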
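The division into a common template, cluster-specific overrides, and a propagation policy described above can be illustrated with a minimal Python sketch. It is not the Kubefed or Karmada API: the function names (`apply_overrides`, `propagate`), the dotted-path override format, and the cluster names `cloud-1` and `edge-1` are all assumptions made for illustration only.

```python
from copy import deepcopy


def apply_overrides(template, overrides):
    """Return a copy of the template with dotted-path overrides applied.

    Hypothetical helper: the "a.b.c" path format is an assumption,
    not the override syntax of any real multi-cluster project.
    """
    result = deepcopy(template)
    for path, value in overrides.items():
        *parents, leaf = path.split(".")
        node = result
        for key in parents:
            # Create intermediate dictionaries when the path is missing.
            node = node.setdefault(key, {})
        node[leaf] = value
    return result


def propagate(template, overrides_by_cluster, target_clusters):
    """Render one manifest per target cluster.

    target_clusters plays the role of the propagation (placement) policy;
    clusters without an entry in overrides_by_cluster get the template as-is.
    """
    return {
        cluster: apply_overrides(template, overrides_by_cluster.get(cluster, {}))
        for cluster in target_clusters
    }


# Hypothetical workload: the common configuration shared by all clusters.
template = {"name": "web", "spec": {"replicas": 2, "image": "nginx:1.25"}}

# Cluster-specific configuration: the edge cluster runs fewer replicas.
overrides_by_cluster = {"edge-1": {"spec.replicas": 1}}

# Scheduling policy reduced to an explicit list of target clusters.
target_clusters = ["cloud-1", "edge-1"]

manifests = propagate(template, overrides_by_cluster, target_clusters)
for cluster, manifest in manifests.items():
    print(cluster, manifest)
```

In this sketch the parent (control) cluster would hold the template and policies, and each rendered manifest would be delivered to the corresponding child cluster; the delivery step itself is left out.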




Article download: http://www.ihrv.cn/resource/share/2000005882

