ZHANG Jiaqi  YOU Keyou
Abstract
For the distributed optimization problem over networked multi-agent systems, this paper studies a distributed algorithm that uses only the sign of the relative state between neighbors. The algorithm does not require the graph-induced weight matrix to be doubly stochastic. We first interpret the algorithm via the penalty method from optimization theory, and then analyze its convergence and convergence rate over static graphs. Compared with existing distributed gradient-descent algorithms that use the exact relative state of neighbors, the convergence rate of the proposed algorithm is not essentially degraded. We further extend the algorithm to deterministically and randomly time-varying graphs and establish the corresponding convergence results. Finally, numerical simulations validate the effectiveness of the algorithm.
Key words
distributed optimization;multi-agent networks;sign of relative state;penalty method;subgradient methods
CLC number  TP13
Document code  A
0 Introduction
In recent years, distributed optimization over multi-agent systems has received increasing attention. In distributed optimization, all agents in a multi-agent system cooperate to find a minimizer of a global objective function that is the sum of the agents' local objective functions. A distinguishing feature of the problem is that each agent knows only its own local objective, so the agents must collaborate; the communication topology among the agents is usually modeled by a graph. Distributed optimization has many practical applications, such as formation control of AUVs (autonomous underwater vehicles) [1], large-scale machine learning [2-4], and quantile regression over sensor networks [5].
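To make the setup concrete, the following is a minimal sketch of the problem; the scalar quadratic local costs and the data values a_i are illustrative and not taken from the paper. Each agent i privately holds f_i, while the network as a whole must minimize the sum F.

```python
a = [1.0, 3.0, 8.0]                  # private data held by the 3 agents

def f(i, x):                         # local objective, known only to agent i
    return (x - a[i]) ** 2

def F(x):                            # global objective: no single agent knows it
    return sum(f(i, x) for i in range(len(a)))

x_star = sum(a) / len(a)             # for these quadratics the minimizer is the mean
```

The point of the example is that x_star depends on all of the a_i, so no agent can compute it from its own data alone.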
Mainstream algorithms for this class of problems (see [6-8] and the references therein) typically consist of two parts. The first part, which we call the consensus term, drives the states of all agents toward agreement and is usually built on existing consensus algorithms [9]. The second part steers the agreed state toward a minimizer of the global objective; since the information available to each agent is typically the value and subgradient of its local objective, subgradient methods are widely adopted. To achieve consensus, most existing algorithms require each agent to access the exact [8-9] or quantized [10-11] states of its neighbors. In some situations, however, an agent can only obtain coarse relative state information with respect to its neighbors. Consider, for example, multiple agents (robots) moving in a plane, where each agent only knows whether a nearby agent is to its left or right and above or below it, rather than the exact relative position. In this case each agent obtains from its neighbors only the sign of the relative position along each coordinate axis. This is markedly different from works on quantized states [11], which use quantized rather than exact state values, and also differs from studies that use quantized gradient information [12]. In short, most existing work cannot handle the case where only the sign of the relative state is available. A further advantage of the proposed algorithm is that the interaction matrix of the agents need not be doubly stochastic, but only symmetric. The doubly stochastic assumption is common in existing algorithms [6,13-14], yet it can be hard to satisfy in practical distributed optimization: for instance, the widely used Metropolis construction of a doubly stochastic matrix requires each agent to know the out-degrees of its neighbors, which may be unavailable in practice, especially when the communication topology is time-varying.
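For contrast with the sign-only setting, the classical two-part update described above (a consensus term averaging neighbors' exact states, plus a local subgradient step, in the spirit of [6]) can be sketched as follows; the doubly stochastic weights, step sizes, and quadratic local costs are all illustrative:

```python
a = [1.0, 3.0, 8.0]                          # private data: f_i(x) = (x - a_i)**2

def grad(i, x):                              # gradient of agent i's local cost
    return 2 * (x - a[i])

W = [[0.50, 0.25, 0.25],                     # doubly stochastic mixing weights --
     [0.25, 0.50, 0.25],                     # exactly the assumption the proposed
     [0.25, 0.25, 0.50]]                     # sign-based algorithm drops

x = [0.0, 5.0, 10.0]                         # initial agent states
for k in range(2000):
    alpha = 1.0 / (k + 1)                    # diminishing step size
    x = [sum(W[i][j] * x[j] for j in range(3)) - alpha * grad(i, x[i])
         for i in range(3)]
# every agent approaches the global minimizer, the mean of a (here 4.0)
```

Note that the consensus term sum(W[i][j]*x[j]) requires each agent to read its neighbors' exact states, which is precisely the information assumption relaxed in this paper.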
Designing algorithms based on the sign of the relative state typically involves nonlinear analysis, and is thus fundamentally different from most algorithms analyzed with algebraic graph theory. In [15], the authors proposed a consensus algorithm using only one bit of relative state information, but did not consider optimization. A similar algorithm can be used to compute the median of data [16]. The algorithm in [17] is closest to ours, but it is a continuous-time algorithm and adopts an entirely different analysis; a more detailed comparison is given later.
In fact, existing works that use the sign of the relative state all consider the continuous-time setting. Discrete-time algorithms, however, deserve study: many algorithms require communication among agents as well as control of the agents, and in current applications communication and control signals are often discrete. Moreover, discrete-time algorithms are simpler to implement. On the other hand, continuous-time algorithms are usually analyzed via Lyapunov theory by constructing a Lyapunov function, whereas the step size introduced in a discrete-time algorithm generally cannot guarantee that a Lyapunov function is monotonically non-increasing. Lyapunov methods are therefore difficult to apply to discrete-time algorithms, and new tools are needed to analyze such nonlinear discrete-time algorithms, which is the concern of this paper.
Specifically, this paper studies discrete-time distributed optimization algorithms that use only the sign of the relative state. Unlike most existing methods, our analysis is based on optimization theory rather than algebraic graph theory or Lyapunov theory. This viewpoint has two potential benefits. First, the prevailing approach is to propose an algorithm and then find a Lyapunov function to establish its stability and convergence; by contrast, an algorithm derived from optimization theory is more intuitive, since it essentially minimizes a carefully designed objective function. Second, the rich toolbox of optimization theory makes the algorithm easier to extend and improve. For example, our algorithms on time-varying graphs are direct extensions of the static-graph algorithm. Concretely, we extend the static-graph algorithm to deterministically and randomly time-varying graphs: the former model the time-varying communication topologies encountered in practice, while the latter describe phenomena such as gossip networks and random packet losses in communication networks. Based on optimization theory, we design and analyze the algorithms on deterministically and randomly time-varying graphs using the incremental gradient method and the stochastic gradient method, respectively.
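A minimal sketch of the penalty-method viewpoint, under the assumption that the algorithm amounts to subgradient descent on the L1-penalized objective F_lam(x) = sum_i f_i(x_i) + lam * sum_{(i,j) in E} |x_i - x_j|, whose consensus term only needs sign(x_i - x_j); the graph, penalty weight lam, step sizes, and quadratic costs below are illustrative:

```python
def sign(v):                               # returns -1, 0 or 1
    return (v > 0) - (v < 0)

a = [1.0, 3.0, 8.0]                        # f_i(x) = (x - a_i)**2, as before
neighbors = {0: [1], 1: [0, 2], 2: [1]}    # hypothetical path graph 0-1-2
lam = 20.0                                 # penalty weight, assumed large enough
                                           # for the penalty to be exact
x = [0.0, 5.0, 10.0]                       # initial agent states
for k in range(10000):
    alpha = 1.0 / (k + 1)                  # diminishing step size
    g = [2 * (x[i] - a[i])                 # subgradient of the local cost, plus
         + lam * sum(sign(x[i] - x[j]) for j in neighbors[i])  # sign-only term
         for i in range(3)]
    x = [x[i] - alpha * g[i] for i in range(3)]
# the states chatter toward consensus near the global minimizer (4.0)
```

Because the subgradient of |x_i - x_j| is exactly sign(x_i - x_j), each agent's update needs only one bit of relative state per neighbor per coordinate, and no doubly stochastic weights appear.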
The main results of this paper are as follows. For a static graph, we prove that under the proposed algorithm the states of all agents asymptotically converge to a common minimizer of the global objective, and that the convergence rate is not essentially slower than that of existing algorithms. For deterministically time-varying graphs, we introduce the notion of uniform joint connectivity and prove convergence under uniformly jointly connected graphs. For randomly time-varying graphs, we consider a special case, randomly activated graphs, and prove convergence with probability one.
The remainder of this paper is organized as follows. Section 1 introduces preliminaries and formulates the distributed optimization problem. Section 2 presents the proposed discrete-time distributed optimization algorithm based on the sign of the relative state. Section 3 analyzes the convergence and convergence rate of the algorithm on static graphs. Section 4 studies its convergence on uniformly jointly connected graphs and randomly activated graphs. Section 5 reports numerical simulations and compares them with the theoretical results. Section 6 concludes the paper.
6 Conclusion
This paper studied a distributed optimization algorithm for minimizing an additive objective function over a multi-agent system, whose distinguishing feature is that it uses only the sign of the relative state between nodes. The algorithm applies to both static and time-varying network topologies. For static topologies, we first interpreted the algorithm via the penalty method from optimization theory, then analyzed its convergence under diminishing and constant step sizes, and finally showed that the convergence rate lies between O(1/ln k) and O(1/√k). For time-varying topologies, we studied the performance of the algorithm on uniformly jointly connected graphs and randomly activated graphs and established the corresponding convergence results. Finally, a numerical example of distributed computation of the geometric median illustrated the effectiveness of the algorithm and corroborated the theoretical results.
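As a centralized baseline for the geometric-median simulation, Weiszfeld's method [25] iterates a reweighted average of the data points. The sample points below are illustrative, and the degenerate case where the iterate lands exactly on a data point is handled only crudely:

```python
import math

pts = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0), (10.0, 10.0)]

def weiszfeld(pts, iters=200):
    # start from the centroid of the data
    x = [sum(p[0] for p in pts) / len(pts),
         sum(p[1] for p in pts) / len(pts)]
    for _ in range(iters):
        wsum, num = 0.0, [0.0, 0.0]
        for p in pts:
            d = math.hypot(x[0] - p[0], x[1] - p[1])
            if d < 1e-12:                # iterate hit a data point exactly
                return p
            w = 1.0 / d                  # inverse-distance weight
            wsum += w
            num[0] += w * p[0]
            num[1] += w * p[1]
        x = [num[0] / wsum, num[1] / wsum]
    return tuple(x)

median = weiszfeld(pts)
```

For these points the median lies on the diagonal between the square's center (2, 2) and the outlier (10, 10), but much closer to the center, illustrating the robustness of the geometric median to outliers.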
Acknowledgement  The authors are grateful to Professor Tamer Ba?ar for his valuable suggestions on this work. This work was supported by the National Natural Science Foundation of China (61722308) and the National Key Research and Development Program of China (2017YFC0805310).
References
[ 1 ] You K,Xie L.Network topology and communication data rate for consensusability of discrete-time multi-agent systems[J].IEEE Transactions on Automatic Control,2011,56(10):2262-2275
[ 2 ] You K,Tempo R,Xie P.Distributed algorithms for robust convex optimization via the scenario approach[J].IEEE Transactions on Automatic Control,2018,DOI:10.1109/TAC.2018.2828093
[ 3 ] Cevher V,Becker S,Schmidt M.Convex optimization for big data:scalable,randomized,and parallel algorithms for big data analytics[J].IEEE Signal Processing Magazine,2014,31(5):32-43
[ 4 ] Nedi? A,Olshevsky A,Rabbat M G.Network topology and communication-computation tradeoffs in decentralized optimization[J].Proceedings of the IEEE,2018,106(5):953-976
[ 5 ] Wang H,Li C.Distributed quantile regression over sensor networks[J].IEEE Transactions on Signal and Information Processing over Networks,2017,DOI:10.1109/TSIPN.2017.2699923
[ 6 ] Nedic A,Ozdaglar A.Distributed subgradient methods for multi-agent optimization[J].IEEE Transactions on Automatic Control,2009,54(1):48-61
[ 7 ] Li T,Fu M,Xie L,et al.Distributed consensus with limited communication data rate[J].IEEE Transactions on Automatic Control,2011,56(2):279-292
[ 8 ] Nedic A,Olshevsky A.Distributed optimization over time-varying directed graphs[C]∥Proceedings of the 52nd IEEE Conference on Decision and Control,2013:6855-6860
[ 9 ] Olfati-Saber R,Murray R M.Consensus problems in networks of agents with switching topology and time-delays[J].IEEE Transactions on Automatic Control,2004,49(9):1520-1533
[10] Yi P,Hong Y.Quantized subgradient algorithm and data-rate analysis for distributed optimization[J].IEEE Transactions on Control of Network Systems,2014,1(4):380-392
[11] Pu Y,Zeilinger M N,Jones C N.Quantization design for distributed optimization[J].IEEE Transactions on Automatic Control,2017,62(5):2107-2120
[12] Magnússon S,Enyioha C,Li N,et al.Convergence of limited communications gradient methods[J].IEEE Transactions on Automatic Control,2017,DOI:10.1109/TAC.2017.2743678
[13] Shi W,Ling Q,Wu G,et al.EXTRA:an exact first-order algorithm for decentralized consensus optimization[J].SIAM Journal on Optimization,2015,25(2):944-966
[14] Mokhtari A,Ling Q,Ribeiro A.Network Newton distributed optimization methods[J].IEEE Transactions on Signal Processing,2017,65(1):146-161
[15] Chen G,Lewis F L,Xie L.Finite-time distributed consensus via binary control protocols[J].Automatica,2011,47(9):1962-1968
[16] Franceschelli M,Giua A,Pisano A.Finite-time consensus on the median value with robustness properties[J].IEEE Transactions on Automatic Control,2017,62(4):1652-1667
[17] Lin P,Ren W,F(xiàn)arrell J A.Distributed continuous-time optimization:nonuniform gradient gains,finite-time convergence,and convex constraint set[J].IEEE Transactions on Automatic Control,2017,62(5):2239-2253
[18] Clarke F H,Stern R J,Ledyaev Y S,et al.Nonsmooth analysis and control theory[M].Graduate Texts in Mathematics,vol 178.New York:Springer,1998
[19] Bertsekas D P.Convex optimization algorithms[M].Athena Scientific,2016
[20] Zhang J,You K,Ba?ar T.Distributed discrete-time optimization in multi-agent networks using only sign of relative state[J].IEEE Transactions on Automatic Control,2018,Conditionally Accepted
[21] Zhang J,You K.Distributed optimization in multi-agent networks using one-bit of relative state information[M]∥Ba?ar T.Uncertainty in Complex Networked Systems.Birkh?user,2018
[22] Zhang J,You K.Distributed optimization with binary relative information over deterministically time-varying graphs[C]∥The 57th IEEE Conference on Decision and Control,Miami Beach,FL,USA,2018,Accepted
[23] Borkar V S.Stochastic approximation:a dynamical systems viewpoint[M].Cambridge:Cambridge University Press,2008
[24] Cohen M B,Lee Y T,Miller G,et al.Geometric median in nearly linear time[C]∥Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing,2016:9-21
[25] Beck A,Sabach S.Weiszfeld's method:old and new results[J].Journal of Optimization Theory and Applications,2015,164(1):1-40
Distributed optimization using the sign of relative state information
ZHANG Jiaqi1,2 ?YOU Keyou1,2
1 Department of Automation,Tsinghua University,Beijing 100084
2 Beijing National Research Center for Information Science and Technology,Tsinghua University,Beijing 100084
Abstract  The design of distributed discrete-time algorithms to cooperatively solve an additive cost optimization problem in multi-agent networks is presented in this paper.The striking feature of the distributed algorithms lies in the use of only the sign of the relative state information between neighbors,which substantially differentiates our algorithms from the existing ones.Moreover,the algorithm does not require the interaction matrix to be doubly stochastic.We first interpret the proposed algorithms in terms of the penalty method in optimization theory and then perform a non-asymptotic analysis to study their convergence for static network graphs.Compared with the celebrated distributed subgradient algorithms,which,however,use the exact relative state information,the convergence speed of the proposed algorithms is essentially not affected by the loss of information.We also extend our results to the cases of deterministically and randomly time-varying graphs.Finally,we validate the theoretical results through simulations.
Key words? distributed optimization;multi-agent networks;sign of relative state;penalty method;subgradient iterations
Journal of Nanjing University of Information Science & Technology,2018,No.6