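Cooperation Dynamics in N-Person Evolutionary Snowdrift Games
ZHENG Yue-long1, ZHANG Wei-guo1,2
(1. School of Economics and Business Administration, Chongqing University, Chongqing 400044, China; 2. College of Economics and Management, Southwest University, Chongqing 400715, China)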
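Evolutionary game theory is central to the study of cooperation at all scales. Working with a well-mixed population and starting from the way existing snowdrift models suppress cooperation, this paper introduces time cost as a decision parameter and constructs an N-person evolutionary snowdrift game with time cost. It then compares the existing and the new model by numerical simulation. The results show that the new model overcomes, to some extent, the endogenous shortcoming of the existing model. Time cost and the benefit-cost ratio are both important variables in agents' strategy choice; they act in the same direction and can substitute for each other. Group size, by contrast, markedly inhibits agents' cooperative behavior. Weighing these positive and negative forces can bring about cooperation.
Snowdrift model; N-person evolutionary game; cooperation dynamics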
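Evolutionary game theory plays a central role in studying the emergence and evolution of cooperation at all scales, and it has attracted growing attention from scholars [1~3]. Traditionally, interactions between people have been modeled as one-shot, symmetric two-person cooperation dilemmas, such as the Prisoner's Dilemma, the Snowdrift Game and the Stag-Hunt Game [4]. In the real world, however, collective decisions involving more than two people are more common. Such cooperative behavior is best studied within the framework of N-person games [5~7], of which the public goods game (PGG) is the typical representative. Motivated by current research on the snowdrift model and the PGG, this paper first analyzes the existing snowdrift game (SG) and the N-person Snowdrift Game (NSG) [8~10] and points out their shortcomings. It then introduces a time-cost factor to construct an N-person snowdrift game with time cost (DCNSG, Delay Cost NSG) and solves it. Finally, it compares the two N-person evolutionary game models by numerical simulation.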
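In the standard SG, two drivers on a road are blocked at the same time by a snowdrift, and they can continue to their destinations only if the snowdrift is shoveled away. Three outcomes are possible: neither shovels, so neither reaches the destination; both cooperate in shoveling, so both reach the destination and share the cost of shoveling; or only one shovels, so both reach the destination but the shoveler alone bears the full cost. The payoffs and notation are defined as follows. The benefit of reaching the destination is b and the total cost of shoveling is c. If both cooperate, each receives b-c/2; if neither cooperates, both receive 0; if only one cooperates, the cooperator (C) receives b-c and the defector (D) receives b. The benefit is usually assumed to exceed the cost, which gives payoffs with the ordering characteristic of the chicken, hawk-dove or snowdrift dilemma [11]; only when b > c > 0 is cooperation profitable for a player. A participant whose payoff depends on the strategies adopted by the other participants is called an agent in the snowdrift game.
Going further, imagine that the snowdrift blocks N drivers at a crossroads at the same time. All of them want to reach their destinations and obtain the same benefit, but not all are willing to do the work of shoveling. If all cooperate, each receives b-c/N. If k (k≥1) agents cooperate (C), each of them receives b-c/k, while those who refuse to shovel (D) reach the destination at no cost and receive b. The standard snowdrift model is thereby generalized to an N-person snowdrift game (NSG), whose payoffs are:
$$P_C(k) = b - \frac{c}{k}, \quad k \in [1, N] \tag{1}$$
$$P_D(k) = \begin{cases} 0, & k = 0 \\ b, & k \in [1, N-1] \end{cases} \tag{2}$$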
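The payoff rules described above can be written down directly. The following is a minimal Python sketch, assuming the NSG payoffs as stated in the text (a cooperator among k cooperators receives b - c/k; a defector receives b if at least one agent shovels and 0 otherwise). The function names are illustrative and do not come from the paper.

```python
def payoff_cooperator(k: int, b: float, c: float) -> float:
    """Payoff to each of k >= 1 cooperators: benefit b minus an equal share of cost c."""
    if k < 1:
        raise ValueError("a cooperator implies at least one cooperator in the group")
    return b - c / k


def payoff_defector(k: int, b: float) -> float:
    """Payoff to a defector when k group members cooperate: b if anyone shovels, else 0."""
    return b if k >= 1 else 0.0
```

With two players this reduces to the standard SG: `payoff_cooperator(2, b, c)` gives b - c/2, `payoff_cooperator(1, b, c)` gives b - c, and `payoff_defector(1, b)` gives b.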
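From (1) and (2), defection is a participant's optimal strategy in the N-person snowdrift game, because defecting guarantees a payoff at least as high as that of the other participants. This clearly suppresses the emergence of cooperation and is an endogenous shortcoming of the model. In addition, when all participants defect, the existing model assigns every agent a payoff of zero, which implicitly assumes that neither the delay nor the business to be done at the destination matters. This is at odds with reality and common sense: people mostly engage in purposeful, conscious activity, and if nobody shovels, agents suffer material and psychological losses from the delay, such as compensation payments, penalties from superiors and anxiety. Their payoff should therefore be negative rather than zero. Furthermore, even if some or all agents shovel, the work cannot be finished at a stroke. Until the snowdrift is completely cleared, cooperators and defectors alike must bear losses related to their business at the destination and to their state of mind. An agent who does not shovel, for example, must wait until the snowdrift is cleared, and the cost of waiting depends on what awaits at the destination. The more important that business, the larger the waiting cost and the more likely the agent is to cooperate, because the more people shovel, the sooner the snowdrift is cleared and the more the agent gains, or the less the agent loses, from the time saved. The existing model ignores this as well. The existing NSG model can be extended and improved to address these shortcomings.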
A more realistic model requires additional parameters [8]. Introducing a time parameter into the existing NSG model improves it and brings it closer to reality. To this end, expressions (1) and (2) can be extended to:
(4)
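Because agents are boundedly rational, they are unlikely to cooperate in shoveling from the outset; cooperation instead emerges through a dynamic, interactive process of continual learning. In an N-person evolutionary snowdrift game in a well-mixed population, evolutionary behavior can be described by replicator dynamics [12]. Let x denote the frequency of cooperators, that is, the share of agents in the population who cooperate at time t [13~14], so that the frequency of defectors is 1-x. The time evolution of x is given by the following differential equation [12]: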
(6)
(8)
(10)
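As an illustration of replicator dynamics in a well-mixed population, here is a minimal Python sketch for the existing NSG only, using the payoffs of (1) and (2). It assumes the standard formulation in which a focal agent's N-1 co-players are each cooperators with probability x (binomial sampling) and in which dx/dt = x(f_C - f̄) with f̄ = x f_C + (1-x) f_D, which simplifies to x(1-x)(f_C - f_D). The function names, the forward-Euler scheme and the step size are illustrative assumptions, not the authors' implementation, and the delay-cost payoffs are not modeled here.

```python
from math import comb


def fitness(x: float, N: int, b: float, c: float) -> tuple[float, float]:
    """Expected payoffs (f_C, f_D) of a focal agent whose N-1 co-players
    are each cooperators with probability x (binomial sampling)."""
    f_c = 0.0
    f_d = 0.0
    for j in range(N):  # j = number of cooperators among the N-1 co-players
        p = comb(N - 1, j) * x**j * (1 - x) ** (N - 1 - j)
        f_c += p * (b - c / (j + 1))       # focal cooperator: j + 1 agents share c
        f_d += p * (b if j >= 1 else 0.0)  # focal defector: b only if someone shovels
    return f_c, f_d


def evolve(x0: float, N: int, b: float, c: float,
           dt: float = 0.01, steps: int = 20000) -> float:
    """Forward-Euler integration of dx/dt = x (1 - x) (f_C - f_D)."""
    x = x0
    for _ in range(steps):
        f_c, f_d = fitness(x, N, b, c)
        x += dt * x * (1 - x) * (f_c - f_d)
        x = min(max(x, 0.0), 1.0)  # keep the frequency inside [0, 1]
    return x
```

For N = 2 the interior rest point can be checked by hand: f_C - f_D = (1-x)(b-c) - xc/2, which vanishes at x = (b-c)/(b-c/2).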
Substituting (3) and (4) into (6) and (7) and combining the result with (10) gives:
Using the identity:
(12)
we obtain:
and therefore
(14)
Applying (14) to (11) gives:
This is the N-th order analytical equation in x for the steady state of the N-person snowdrift game with time cost (DCNSG).
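The reduction from a binomial sum to a polynomial in x can be checked numerically. The sketch below uses a standard binomial identity, Σ_{j=0}^{N-1} C(N-1, j) x^j (1-x)^{N-1-j} / (j+1) = [1 - (1-x)^N] / (N x), which removes the 1/(j+1) cost-sharing term from the cooperator's expected payoff. Whether this is exactly the form of identity (12) is an assumption. The gap function applies the identity to the existing NSG payoffs (1) and (2) only; the delay-cost terms are not included, and all function names are illustrative.

```python
from math import comb


def binomial_mean_inverse(x: float, N: int) -> float:
    """Left-hand side: E[1 / (j + 1)] for j ~ Binomial(N - 1, x)."""
    return sum(
        comb(N - 1, j) * x**j * (1 - x) ** (N - 1 - j) / (j + 1)
        for j in range(N)
    )


def closed_form(x: float, N: int) -> float:
    """Right-hand side: (1 - (1 - x)^N) / (N x), valid for 0 < x <= 1."""
    return (1 - (1 - x) ** N) / (N * x)


def nsg_gap(x: float, N: int, b: float, c: float) -> float:
    """f_C - f_D for the NSG payoffs (1)-(2) after applying the identity:
    b (1 - x)^(N - 1) - c (1 - (1 - x)^N) / (N x)."""
    return b * (1 - x) ** (N - 1) - c * closed_form(x, N)
```

Setting `nsg_gap` to zero and multiplying through by N x gives a polynomial of degree N in x, which matches the order of the steady-state equation described above.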
Figure 1. Numerical simulation results for the steady state x when w=0
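Further, in Figure 2 (right), with the group size fixed at N=30, the stable equilibrium x of cooperating agents on the whole rises as the time cost (w) and the benefit-cost ratio (b/c) increase. When the time cost is small (w=2), the steady state x rises with b/c; as w increases, the stable equilibrium level rises higher and higher; and when the time cost reaches a high level (w=1010), agents tend to cooperate at any level of b/c. Similarly, as Figure 3 (left) shows, when the benefit-cost ratio reaches a high level (b/c=1015), agents also tend to cooperate at any time cost. These results essentially reflect the positive and negative sides of what drives agents to cooperate, and the two act in the same direction: a sufficiently large time cost or a sufficiently large benefit-cost ratio leads agents to cooperate because the loss from not cooperating becomes too great. Time cost and the benefit-cost ratio are therefore mutually substitutable. Since time cost arises from the importance of the agent's business at the destination, these results show that incorporating time cost makes the emergence of cooperation possible. They further support the hypothesis that the larger the benefit of the business at the destination (and hence b/c), that is, the more important that business, the larger the agent's time cost.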
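To further illustrate the effect of the time cost w on the steady state x, take the benefit-cost ratio b/c=5. As Figure 3 (right) shows, for N = 2, 5 and 10 the steady state x rises fairly quickly with w. For N greater than 20, the rise of x with w slows markedly as N grows, and the stable equilibrium level falls as well. This further shows that N dampens the effect of w and thereby inhibits cooperation, in line with the simulation results in Figures 1 and 2. A possible reason is that coordination is easier in small groups (a lone agent, for instance, has no choice but to shovel), whereas coordination becomes increasingly difficult as the group grows (for example, free riders and shirkers multiply). Hence, to get agents to actually cooperate in shoveling, coordination and incentive measures are needed in addition to time cost and the benefit-cost ratio.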
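The inhibiting effect of group size can be illustrated without the delay-cost terms. The following is a minimal Python sketch for the existing NSG only, with the payoffs of (1) and (2) and no time cost. It finds the interior steady state by bisection on f_C - f_D = b(1-x)^{N-1} - c[1-(1-x)^N]/(Nx), a form that follows from those payoffs under binomial sampling in a well-mixed population. The function names, tolerance and parameter values are illustrative assumptions, and the sketch does not reproduce the paper's figures.

```python
def nsg_gap(x: float, N: int, b: float, c: float) -> float:
    """f_C - f_D for the NSG without delay cost, for 0 < x <= 1."""
    return b * (1 - x) ** (N - 1) - c * (1 - (1 - x) ** N) / (N * x)


def steady_state(N: int, b: float, c: float, tol: float = 1e-12) -> float:
    """Interior root of nsg_gap on (0, 1), found by bisection.

    Assumes N >= 2 and b > c > 0, so the gap tends to b - c > 0 as x -> 0
    and equals -c / N < 0 at x = 1, with a single sign change in between.
    """
    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if nsg_gap(mid, N, b, c) > 0:
            lo = mid  # cooperators still do better: root lies to the right
        else:
            hi = mid  # defectors do better: root lies to the left
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for N in (2, 5, 10, 20, 30):
        print(N, round(steady_state(N, b=5.0, c=1.0), 4))
```

Under these assumptions the steady-state share of cooperators falls as N grows at fixed b/c and rises with b/c at fixed N.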
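The outcome of existing snowdrift game models is non-cooperation. This paper takes the time cost w into account and constructs an N-person snowdrift game with time cost (DCNSG). Numerical simulation of the model with time cost shows the following. The new game model is somewhat effective in overcoming the way the existing model suppresses cooperation, and incorporating time cost makes cooperation in the snowdrift game possible. Time cost and the benefit-cost ratio act in the same direction and are mutually substitutable; they are two important decision variables in agents' choice of behavior. Group size strongly inhibits cooperation, and weighing these positive and negative forces can lead agents to cooperate. These conclusions offer some guidance for promoting cooperation in public goods games (PGG), such as the construction of public works and the maintenance of public environmental sanitation: use the benefit-cost ratio to examine and assess how important an agent's business at the destination is, use that in turn to gauge the agent's time cost and judge the agent's willingness to cooperate, and raise agents' willingness and efficiency in cooperating through active communication, coordination and trust building.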
[1] Macy M, Flache A. Learning dynamics in social dilemmas[J]. Proc Natl Acad Sci U S A, 2002, 99:7229-7236.
[2] Nowak MA. Five rules for the evolution of cooperation[J]. Science, 2006, 314(5805):1560-1563.
[3] Sigmund K. The Calculus of Selfishness[M]. Princeton: Princeton University Press, 2009.49-80.
[4] Santos MD, Pinheiro FL, Santos FC, et al. Dynamics of N-person snowdrift games in structured populations[J]. Journal of Theoretical Biology, 2012, 315:81-86.
[5] Gokhale CS, Traulsen A. Evolutionary games in the multiverse[J]. Proc Natl Acad Sci U S A, 2010, 107(12):5500-5504.
[6] Santos FC, Pacheco JM. Risk of collective failure provides an escape from the tragedy of the commons[J]. Proc Natl Acad Sci U S A, 2011, 108(26):10421-10425.
[7] Van Segbroeck S, Pacheco JM, Lenaerts T, et al. Emergence of fairness in repeated group interactions[J]. Physical Review Letters, 2012, 108(15):1-5.
[8] Zheng DF, Yin HP, Chan CH, et al. Cooperative behavior in a model of evolutionary snowdrift games with N-person interactions[J]. Europhysics Letters, 2008, 80:1-4.
[9] Galbiati R, Vertova P. Obligations and cooperative behaviour in public good games[J]. Games and Economic Behavior, 2008, 64(1):146-170.
[10] Souza MO, Pacheco JM, Santos FC. Evolution of cooperation under N-person snowdrift games[J]. Journal of Theoretical Biology, 2009, 260(4):581-588.
[11] Maynard Smith J. Evolution and the theory of games[M]. Cambridge: Cambridge University Press, 1982.10-27.
[12] Hofbauer J, Sigmund K. Evolutionary games and population dynamics[M]. Cambridge: Cambridge University Press, 1998.57-79.
[13] Hauert C, Doebeli M. Spatial structure often inhibits the evolution of cooperation in the snowdrift game[J]. Nature, 2004, 428(6983):643-646.
[14] Zhong LX, Zheng DF, Zheng B, et al. Networking effects on cooperation in evolutionary snowdrift game[J]. Europhysics Letters, 2006, 76(4):724-730.
[15] Hauert C, Michor F, Nowak MA, et al. Synergy and discounting of cooperation in social dilemmas[J]. Journal of Theoretical Biology, 2006, 239(2):195-202.
Cooperation Dynamics under N-person Snowdrift Games
ZHENG Yue-long1, ZHANG Wei-guo1,2
(1. School of Economics and Business Administration, Chongqing University, Chongqing 400044, China; 2. College of Economics and Management, Southwest University, Chongqing 400715, China)
Evolutionary game theory plays a central role in the study of the emergence and evolution of cooperation at all scales. Traditionally, scholars have modeled interactions as one-shot, symmetric two-person dilemmas of cooperation, such as the Prisoner's Dilemma, the Snowdrift Game and the Stag-Hunt Game. In the real world, however, collective situations are common. For instance, accomplishing a task often requires several group members to cooperate. They bear all the related costs, while those who do not cooperate simply share the benefits once the task is done. Such situations involve collective decisions in groups of more than two people. Cooperation problems of this kind are best studied in the framework of N-person games, of which the public goods game (PGG) is the typical example.
Motivated by recent work on the PGG and N-person Snowdrift Games (NSG), we identify the following drawbacks in the existing snowdrift game model. First, the non-cooperation strategy ensures that a participant's payoff is no less than that of the other participants. Non-cooperation is therefore the optimal strategy in the NSG, which is an endogenous drawback of the model. Second, the existing model assigns agents a payoff of zero if no participant cooperates, which implies that neither the delay nor the business to be done at the destination matters to them. This conflicts with reality and common sense, because human beings mainly engage in purposeful, conscious activity. If all agents refuse to shovel snow, they suffer psychological and material losses from the delay, such as anxiety and penalties. Their payoff should thus be negative rather than zero. Finally, even if some or all agents shovel snow, the snowdrift is unlikely to be cleared in one step. Until the snow is completely cleared, both cooperators and defectors must bear losses that depend on the importance of their business at the destination and on their psychological condition. This surely affects agents' strategic choices, but the existing model ignores it.
To further study agents' cooperation dynamics in a well-mixed population, and starting from the defects of the existing snowdrift game, we build a Delay Cost N-person Snowdrift Game (DCNSG) by incorporating delay cost into the existing model, and we compare the two models using numerical methods in MATLAB. The study reveals that the DCNSG is somewhat effective in overcoming the shortcomings of the existing model. The added delay cost makes cooperation in the system possible and acts in the same direction as the benefit-cost ratio. The two can substitute for each other and are important decision variables for agents. Group size, however, clearly restrains the emergence of cooperation. Cooperative behavior can emerge from weighing these positive and negative forces.
Our findings have implications for solving PGG cooperation problems, such as public engineering construction and public environmental maintenance. Specifically, the importance of agents' business at their destination should be examined and evaluated in terms of the benefit-cost ratio. In addition, agents' time cost should be measured in order to reveal their intention to cooperate. At the same time, their willingness and efficiency in cooperating should be improved through active communication, coordination and trust building.
Snowdrift game; N-person evolutionary game; cooperation dynamics
Chinese editor: Du ??; English editor: Charlie C. Chen
CLC number: O152.1
Document code: A
Article ID: 1004-6062(2016)04-0112-05
DOI: 10.13587/j.cnki.jieem.2016.04.014
Received: 2013-10-08
Revised: 2014-06-11
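Funding: Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (20130191120058); Major Program of the National Social Science Foundation of China (12&ZD100)
ZHENG Yue-long (1981-), male, from Taibus Banner, Inner Mongolia; PhD candidate at the School of Economics and Business Administration, Chongqing University; research interests: game theory, corporate strategy and innovation management.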