
    Stochastic Gradient Compression for Federated Learning over Wireless Network

China Communications, 2024, No. 4

Lin Xiaohan, Liu Yuan*, Chen Fangjiong, Huang Yang, Ge Xiaohu

1 School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510641, China

2 Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space, Ministry of Industry and Information Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

3 School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China

Abstract: As a mature distributed machine learning paradigm, federated learning enables wireless edge devices to collaboratively train a shared AI model by stochastic gradient descent (SGD). However, devices need to upload high-dimensional stochastic gradients to the edge server during training, which causes a severe communication bottleneck. To address this problem, we compress the communication by sparsifying and quantizing the stochastic gradients of edge devices. We first derive a closed form of the communication compression in terms of the sparsification and quantization factors. Then, the convergence rate of this communication-compressed system is analyzed and several insights are obtained. Finally, we formulate and solve the quantization resource allocation problem with the goal of minimizing the convergence upper bound, under the constraint of the multiple-access channel capacity. Simulations show that the proposed scheme outperforms the benchmarks.

Keywords: federated learning; gradient compression; quantization resource allocation; stochastic gradient descent (SGD)

I. INTRODUCTION

With the boom of smart mobile devices, abundant data is produced at the network edge. Driven by this valuable data and progressive artificial intelligence (AI) algorithms, intelligent applications tremendously enhance productivity and efficiency in people's daily lives. The shift of the processing platform from centralized cloud datacenters to the user side at the network edge boosts the development of edge AI, or edge intelligence [1-6].

The movement of data processing from the cloud to the edge is highly non-trivial. To be specific, cloud AI focuses on processing an aggregated global dataset. The central server needs to acquire local data from distributed edge devices via wireless links, and thus there inevitably exists a big challenge: transmitting tremendous amounts of data through the radio access network (RAN) incurs high costs, such as delay and energy consumption. In this case, wireless communication becomes the bottleneck of edge intelligence. Besides, personal privacy may be violated when uploading raw data. Federated learning is a popular distributed machine learning paradigm, where edge devices cooperatively train a shared AI model and local data is prevented from leaving the edge devices [7-9]. In particular, edge devices train their local models based on local datasets, and the local models/gradients are uploaded to the edge server. The edge server uses these local gradients to update the global model, and the new global model is broadcast to edge devices for their local training. This server-device exchange is repeated until a certain level of global model accuracy is reached.

However, the server-device exchange leads to a severe communication bottleneck. The reason is that the trained models are usually high-dimensional while the radio resource is scarce and the edge devices are capability-limited. Therefore, the gradients/models need to be compressed before transmission to reduce the communication overhead. In addition, the channel conditions are time-varying and vary across devices, so the compression needs to be adaptive to the channel conditions. The above observations motivate us to study joint gradient compression and design a quantization resource allocation scheme.

    1.1 Related Works

Federated learning can be implemented by the stochastic gradient descent (SGD) algorithm [7,10]. In practice, stochastic gradients contain several high-dimensional vectors or arrays, which need large amounts of communication resources. To deal with this issue, two methods are developed to compress stochastic gradients. The first is sparsification [11,12]. After sparsification, only elements larger than a predefined threshold are retained and uploaded. In [13,14], a digital-distributed SGD (D-DSGD) method with coding and an analog-distributed SGD (A-DSGD) method based on signal superposition are applied. In [15], momentum correction and local gradient clipping are jointly applied to improve model accuracy after sparsification. The second method is quantization, in which single components or blocks of gradients are quantized to codes. For example, the authors in [16] study a kind of scalar quantization method, quantized SGD (QSGD), where vector components are uniformly quantized. It is proved that QSGD makes a compromise between quantization variance and model accuracy. One-bit quantization and signSGD belong to scalar quantization, in which adding momentum to the descent also achieves excellent performance compared with their counterparts [17,18]. In [19], the Terngrad method quantizes the positive, negative and zero elements of gradients to +1, -1 and 0, respectively. The authors in [20] use low-dimensional Grassmannian manifolds to reduce the dimension of the local model gradient. In [21], the local model update is compressed by dimension reduction and vector quantization. In [22], the authors focus on model compression during training in federated learning, where a layered ternary quantization method is used to compress local model networks and different quantization thresholds are used in different layers.

Another field of federated learning is resource allocation optimization. In [23], edge devices upload their local models to the BS via frequency division multiple access (FDMA); the bandwidth allocation, transmit power, local computation frequency and communication time are jointly optimized to enhance energy efficiency. [24] provides comprehensive research on the correlation between training performance and the effect of the wireless network, measured by the packet error rate, where the resource allocation and user selection are jointly optimized to minimize the training loss subject to delay and energy consumption constraints. In [25], the authors design a broadband analog aggregation scheme via over-the-air computation and study the communication-learning tradeoff. The authors in [26] present a federated learning scheme with over-the-air computation, and jointly optimize device selection and beamforming design; this problem is modeled as a sparse and low-rank optimization. The authors in [27] consider federated learning over a multiple access channel (MAC) and optimize the local gradient quantization parameter. In [28], the quantization error and the channel fading are characterized in terms of the received signal-to-noise ratio (SNR) and quantization resource allocation is studied. The authors in [29] propose a time-correlated sparsification method with hybrid aggregation (TCS-H), and jointly optimize the model compression and over-the-air computation to minimize the number of required communication time slots. The energy consumption of local computing and wireless communication is balanced in [30] by optimizing gradient sparsification and the number of local iterations. Adaptive Top-k SGD is proposed in [31] to minimize the convergence error under a communication cost constraint. Rand-k sparsification and stochastic uniform quantization are considered in [32].

    1.2 Contributions and Organization

This paper addresses the problem of communication compression in federated learning. We use sparsification and quantization jointly for local gradient compression to relieve the communication bottleneck. We derive a closed form of the communication compression with respect to the sparsification and quantization parameters. Note that gradient compression results in information loss and thus distorts the training performance. Therefore, we study quantization resource allocation to minimize the convergence upper bound by optimizing the quantization levels of devices. The main contributions of this paper are summarized as follows.

• Closed-Form Communication Compression. We derive a closed-form communication compression in terms of the sparsification and quantization factors, and the correlation between the two factors is analyzed mathematically. It is shown that sparsification and quantization are complementary for communication compression.

• Convergence Rate Analysis. We derive the upper bound of the convergence rate under joint sparsification and quantization. Several useful insights are found via our analysis: i) the proposed scheme converges as the number of device-server iterations grows; ii) when the number of devices K becomes larger, the convergence rate approaches the term governed by the number of device-server iterations M in Theorem 2; iii) communication compression increases the convergence upper bound, which however remains bounded by fine-tuning the training parameters.

• Quantization Resource Allocation. Based on the derived upper bound of the convergence rate, we formulate and solve an optimization problem of minimizing the convergence upper bound by allocating quantization levels among edge devices, subject to the channel capacity that corresponds to the total communication overhead.

In the rest of this paper, we explain the system model in Section II. Some necessary preliminaries are given in Section III. In Section IV, we formulate the communication compression based on sparsification and quantization. In Section V, we derive the convergence rate in terms of the sparsification and quantization factors, and study an optimization problem that allocates quantization levels to improve the convergence performance under the constraint of channel capacity. Experimental results are presented in Section VI, and Section VII concludes the paper.

II. SYSTEM MODEL

We consider a federated learning system which consists of K edge devices and a single edge server. The training procedure is illustrated in Figure 1. Edge devices use their local data to train their local models and calculate the stochastic gradients of the local models. The participating devices share the total wireless uplink channel, which is evenly divided into K orthogonal subchannels. Each subchannel is provided for a device to transmit its compressed gradient. The edge server receives all the compressed local stochastic gradients from the devices and aggregates them by averaging. This aggregated gradient is used for the global model update. Then, the new global model is broadcast to all the participating devices for their local model update. The exchange between the server and devices is repeated until convergence.

Figure 1. The training procedure of communication-efficient federated learning.

In this paper, we assume that θ is the parameter of the global model trained at the edge server, Dk denotes the local dataset of device k, and |Dk| denotes the size of Dk. The local loss function of device k on θ is as follows:

Fk(θ) = (1/|Dk|) ∑_{(xi,yi)∈Dk} f(θ, xi, yi),    (1)

which is the average of the sample-wise loss function f(θ, xi, yi); f(θ, xi, yi) quantifies the prediction deviation of θ on the training sample xi w.r.t. its label yi. The global loss function under θ can be given as

The training aims to find the minimum of the global loss function F(θ), which can be mathematically written as

θ* = arg min_θ F(θ).

To get θ*, the descent direction of F(θ) is needed. SGD is proved to be efficient and well-behaved on large-scale datasets for searching the minimum of a loss function, and is thus widely used in federated learning. In each iteration, say the m-th iteration, each device uses the received global model θ(m) to update the local loss function defined in (1), and calculates the local stochastic gradient by using part of its local dataset. For device k, the stochastic gradient of fk(θ(m)) can be calculated as

where ∇ denotes the derivation operation and the mini-batch is a part of the local dataset of device k with size b.
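Although the displayed expression is omitted above, the computation described is the standard mini-batch stochastic gradient. The following Python sketch illustrates it for device k; the helper name local_stochastic_gradient, the sample-gradient callable and the random sampler are illustrative assumptions rather than notation from the paper.

```python
import numpy as np

def local_stochastic_gradient(theta, local_dataset, sample_grad, b, rng=None):
    """Sketch of device k's mini-batch stochastic gradient of fk(theta).

    theta:         global model parameters received from the edge server
    local_dataset: list of (x_i, y_i) training samples held by device k
    sample_grad:   callable returning the gradient of f(theta, x_i, y_i) w.r.t. theta
    b:             mini-batch size
    """
    rng = rng or np.random.default_rng()
    # Draw a mini-batch of size b from the local dataset without replacement.
    idx = rng.choice(len(local_dataset), size=b, replace=False)
    grads = [sample_grad(theta, *local_dataset[i]) for i in idx]
    # Average the sample-wise gradients over the mini-batch.
    return np.mean(grads, axis=0)
```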

In practice, the local stochastic gradients consist of high-dimensional vectors or arrays, which may have millions of components. The limited battery capacity and wireless bandwidth of devices cannot meet the demand of rapidly transmitting these vectors/arrays. As a consequence, a communication bottleneck occurs, which calls for gradient compression before transmission.

To this end, we first sparsify each stochastic gradient and then quantize the remaining non-zero components by QSGD. This procedure is denoted as

where Tq(·) denotes the sparsification function, in which the q biggest components of a gradient are retained and transmitted while the rest are replaced by zero and neglected, and Qs(·) is a quantization function with quantization level s. After receiving all the compressed local stochastic gradients, the edge server aggregates them to get the approximated global gradient as

where η denotes the learning rate (or step size). The updated global model θ(m+1) is then broadcast to the devices. The server-device iteration is repeated until the global loss function F(θ) is minimized.
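Putting the pieces together, the sketch below shows one server-device round as described above: each device sparsifies and quantizes its local gradient, and the server averages the K compressed gradients and takes a gradient step with learning rate η. The helpers top_q_sparsify and qsgd_quantize refer to the illustrative sketches given later in Sections III and IV; the function and argument names are assumptions, not the paper's notation.

```python
import numpy as np

def federated_round(theta, device_grads, q, levels, eta):
    """One communication round of compressed federated learning (a sketch).

    theta:        current global model vector held by the edge server
    device_grads: list of local stochastic gradients gk, one per device
    q:            number of components kept by Top-q sparsification
    levels:       quantization levels sk, one per device
    eta:          learning rate (step size)
    """
    compressed = []
    for g_k, s_k in zip(device_grads, levels):
        sparse = top_q_sparsify(g_k, q)                 # Tq(.)
        compressed.append(qsgd_quantize(sparse, s_k))   # Qs(.)
    # Server-side aggregation by averaging the K compressed gradients.
    g_hat = np.mean(compressed, axis=0)
    # Global model update; theta is then broadcast back to the devices.
    return theta - eta * g_hat
```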

III. PRELIMINARIES

In this section, we introduce some preliminaries used in our paper, including the concepts of QSGD and Elias coding.

    3.1 QSGD

In the quantization literature, one-bit quantization and signSGD use 1 bit to quantize. Terngrad quantizes the positive, zero and negative components to +1, 0 and -1, respectively. The above methods have a fixed quantization degree. Thus, we select QSGD [16] as the quantization method in our scheme; its quantization degree can be adjusted through the quantization level. Furthermore, QSGD achieves a compromise between quantization variance and training accuracy. Some necessary concepts of QSGD are introduced as follows.

Let ‖v‖2 denote the L2 norm of vector v, which is also the length of vector v. For any scalar r ∈ R, we let sgn(r) ∈ {-1, +1} while sgn(0) = 1, which means that the signs of the positive and negative scalars are +1 and -1, respectively.

QSGD belongs to scalar quantization, which quantizes the vector components to scalar codes respectively. Moreover, the codes are located uniformly on a one-dimensional coordinate axis. An example is shown in Figure 2. We define a quantization function Qs(v), where s ≥ 1 denotes the adjustable quantization level. In that case, the quantization interval is divided into s-1 uniform subintervals and the endpoints of these subintervals are the codes.

Figure 2. An example of QSGD with s=5, where the point 0.6 is quantized to 0.5 with probability p and 0.75 with probability 1-p.

For any non-zero vector v ∈ R^n, Qs(v) is defined component-wise as

[Qs(v)]i = ‖v‖2 · sgn(vi) · ξi(v, s),

where vi is the i-th component of vector v, and ξi(v, s) is an independent random variable, which is a map of |vi|/‖v‖2 and is defined as

where j is an integer which makes sure that 0 ≤ j < s, and p(a, s) is a probability function with p(a, s) = as − j for any a ∈ [0,1]. The above quantization approach is unbiased and introduces minimal variance; that is, ξi(v, s) has minimal variance over distributions supported on the quantization codes. Then, two properties of QSGD are given as follows.
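To make the quantizer concrete, here is a minimal Python sketch of a stochastic uniform quantizer in the spirit of Qs(·) above, assuming the unit interval is split into s-1 uniform subintervals as described; the function name and the use of NumPy are my own choices, not the paper's.

```python
import numpy as np

def qsgd_quantize(v, s, rng=None):
    """Stochastic uniform quantization Qs(v) of QSGD (a sketch, assuming s >= 2).

    Each normalized magnitude |v_i|/||v||_2 is randomly rounded to one of the two
    nearest endpoints j/(s-1) or (j+1)/(s-1), so the quantizer is unbiased.
    """
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    a = np.abs(v) / norm                  # normalized magnitudes in [0, 1]
    scaled = a * (s - 1)
    j = np.floor(scaled)                  # index of the lower endpoint
    p = scaled - j                        # probability of rounding up
    xi = (j + (rng.random(v.shape) < p)) / (s - 1)
    return norm * np.sign(v) * xi
```

With s = 5, a normalized magnitude of 0.6 lies between the endpoints 0.5 and 0.75 and is rounded up with probability 0.4, matching the example in Figure 2; taking the expectation of xi recovers |vi|/‖v‖2, which is the unbiasedness property.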

    3.2 Elias Coding

In practice, a stochastic gradient usually contains several high-dimensional arrays. We flatten these arrays, clip them into vectors with the same length I, and use Elias coding to encode the stochastic gradient after sparsification and quantization.

In QSGD, vector v's code Qs(v) can be expressed by the tuple (‖v‖2, σ, ζ), where σ is a sign vector containing the signs of all components of v, and ζ is a vector whose i-th component is an integer with value s·ξi(v, s). If the code of a vector can be accurately expressed by the tuple, the receiver can recover the initial vector after receiving the code. The so-called Elias coding method can encode this tuple [16]. Given any tuple (‖v‖2, σ, ζ) with quantization level s, the coding method is described as follows. First, 32 bits are used to encode ‖v‖2. Next, it continues to encode the information of σ and ζ. To begin with, the position of the first non-zero entry of ζ is encoded, and then a bit is appended to denote the sign of this entry, i.e., σi, followed by Elias(s·ξi(v, s)). Iteratively, it proceeds to encode the distance from the next non-zero entry of ζ to the current non-zero entry, and then encodes the next σi and s·ξi(v, s) in the same way. The decoding scheme is the inverse process of encoding: we first read off the first 32 bits to reconstruct ‖v‖2, then use the decoding method to read off the positions and values of the non-zero entries of ζ and σ iteratively.
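The sketch below illustrates this encoding procedure in Python. For brevity it uses the classic Elias gamma code as a stand-in for the recursive Elias coding of [16], and the helper names (elias_gamma, encode_tuple) are illustrative assumptions.

```python
import struct

def elias_gamma(n: int) -> str:
    """Elias gamma code of a positive integer, returned as a bit string."""
    assert n >= 1
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary   # unary length prefix + binary value

def encode_tuple(norm, signs, zeta):
    """Encode the tuple (||v||_2, sigma, zeta) produced by QSGD (a sketch)."""
    # 32 bits for the norm, stored as a single-precision float.
    bits = "".join(f"{byte:08b}" for byte in struct.pack(">f", norm))
    prev = -1
    for i, z in enumerate(zeta):
        if z == 0:
            continue                          # zero entries are skipped entirely
        bits += elias_gamma(i - prev)         # gap to the previous non-zero entry
        bits += "0" if signs[i] > 0 else "1"  # one bit for the sign sigma_i
        bits += elias_gamma(int(z))           # Elias code of s * xi_i(v, s)
        prev = i
    return bits
```

Decoding reverses these steps: read 32 bits for ‖v‖2, then alternately decode a gap, a sign bit and a value until the bit string is exhausted.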

IV. TOP-q BASED SPARSIFICATION

We adopt Top-q based sparsification before QSGD. Note that Top-q sparsification is a well-studied method; we do not focus on the method itself but on the transmission-bit reduction under this method. Define the sparsification function as Tq, in which we set a threshold such that only components larger than the threshold go to the QSGD function. In Top-q sparsification, the q components with the largest absolute values are preserved. For a vector v ∈ R^I, its i-th component is defined by

[Tq(v)]i = vi if |vi| ≥ thr, and [Tq(v)]i = 0 otherwise,

where thr is the q-th largest absolute value of v's components.
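A direct Python sketch of Tq(·) follows; np.partition is used to find the threshold, and keeping ties at the threshold is a simplifying assumption of this sketch.

```python
import numpy as np

def top_q_sparsify(v, q):
    """Top-q sparsification Tq(v): keep the q largest-magnitude components (a sketch)."""
    if q >= v.size:
        return v.copy()
    # thr is the q-th largest absolute value of v's components.
    thr = np.partition(np.abs(v), -q)[-q]
    # Components below the threshold are zeroed; survivors keep their values.
    return np.where(np.abs(v) >= thr, v, 0.0)
```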

The number of transmission bits is related to the number of non-zero components of the vectors and the quantization level. Assume that device k's local stochastic gradient gk has P tuples (‖v‖2, σ, ζ) corresponding to P vectors. From Theorem 3.2 in reference [16], the number of transmission bits can be given as

We define β ∈ (0,1] as the fraction of the non-zero components after sparsification, that is, β = q/I.

The reduction of the communication overhead in bits w.r.t. the sparsification factor β and the quantization level sk is derived as follows.

Theorem 1. For given β and sk, the reduction of communication overhead in bits is

where β denotes the percentage of the remaining components after sparsification, sk denotes the quantization level of device k, I denotes the length of the initial vectors, and P means that gradient gk has P vectors.

Proof. Please see Appendix A.

Next, we analyze the relationship of the overhead reduction Bk versus the sparsification factor β and the quantization level sk, respectively. Without loss of generality, we ignore the constant in Bk and rewrite Bk as

    4.1 Bk Versus β

For given sk, we can obtain the following result about Bk versus β.

Proposition 1. There exist two quantization levels sk1 and sk2, such that when sk ∈ [1, sk1) or sk ∈ [sk2, +∞), Bk decreases monotonically as β increases; when sk ∈ [sk1, sk2), Bk first decreases and then increases monotonically as β increases.

Proof. Please see Appendix B.

From Proposition 1, we observe that the quantization level sk significantly affects the relationship between the communication overhead and β. In the ranges [1, sk1) and [sk2, +∞), Bk is dominated by β and decreases monotonically as β grows. In [sk1, sk2), the effects of β and sk reach a compromise and thus Bk has an inflection point. We try to explain this phenomenon. It is intuitive that a larger threshold (smaller β) reduces more transmission bits. When sk is small, β takes the leading role and thus Bk decreases as β increases. A larger quantization level sk, however, needs more transmission bits, which magnifies the influence of β; as a result, a larger quantization level sk magnifies the influence of compression and thus reduces more transmission bits. This opposite effect exactly offsets the impact of β and is the reason for the inflection point. Furthermore, as β grows, the influence of the growing sk becomes limited and β takes back control, so Bk decreases monotonically with β.

    4.2 Bk Versus sk

For any given sk3 > sk4 and β, we have

If we want Bk(β, sk3) − Bk(β, sk4) > 0, sk3 and sk4 should satisfy

Then we can obtain the following result about Bk versus sk.

Proposition 2. For any given β, there always exists a quantization level sk5, such that when sk ∈ [1, sk5), Bk decreases monotonically; when sk ∈ (sk5, +∞), Bk increases monotonically.

Proposition 2 shows that the correlation between β and sk still exists.

V. CHANNEL-AWARE QUANTIZATION

To further compress the communication overhead, we need to choose the quantization level used to quantize the remaining non-zero components after sparsification.

    5.1 Learning Convergence Analysis

Based on the system model and the above preliminaries, we study the learning convergence rate of federated learning with communication compression. Before the analysis, we follow the stochastic optimization literature and give some general assumptions as in [20].

Assumption 1 (Lower Bound). For all θ and a constant F*, we have that the global objective value F(θ) ≥ F*.

Assumption 2 (Smoothness). For the global objective F(θ), let ∇F(θ) denote the gradient at point θ = [θ1, θ2, ···, θI]^T. Then, ∀α with α = [α1, α2, ···, αI]^T and for some nonnegative constant vector l = [l1, l2, ···, lI]^T,

F(α) ≤ F(θ) + ∇F(θ)^T (α − θ) + (1/2) ∑_{i=1}^{I} li (αi − θi)².

Assumption 3 (Variance Bound). The stochastic gradient g(θ) is unbiased and has coordinate-wise bounded variance, i.e., E[g(θ)] = ∇F(θ) and E[(gi(θ) − ∇iF(θ))²] ≤ σgi² for all i, where σg = [σg1, σg2, ···, σgI] is a constant vector with non-negative components.

Based on the above assumptions, we can get the following result.

Theorem 2. When sparsification and quantization are utilized in federated learning, the convergence result is

where M denotes the number of global iterations, … denotes the square of the dynamic range of the gradient vectors' components of device k, K denotes the number of participating devices, F(0) denotes the initial objective value, F* denotes the minimum of the objective defined in Assumption 1, l0 = ‖l‖∞ with l defined in Assumption 2, β denotes the sparsification factor, sk denotes the quantization level of device k, η denotes the learning rate, and I denotes the length of the stochastic gradient.

Proof. Please see Appendix C.

Through Theorem 2, we have the following observations. First, increasing the number of global iterations (or communication rounds) M leads to the convergence of federated learning. Second, to guarantee convergence, the learning rate η should satisfy certain feasibility conditions so that the convergence rate remains bounded, and the corresponding term should be as close to 1 as possible. These requirements can be satisfied during training by adjusting the learning rate η, the sparsification factor β and the gradient length I. Third, the more devices, the lower the upper bound; this is called the multi-user gain. More participating devices provide more local data for learning, and make the aggregated stochastic gradient closer to the central true gradient.

We note that the higher the quantization precision, the better the convergence performance. This can be seen from Theorem 2: as the quantization level sk grows, the convergence upper bound decreases. This observation is in accordance with the fact that a higher quantization level means less quantization loss and thus better training performance. We also prove the relationship between the convergence rate and the sparsification factor.

Proposition 3. The convergence upper bound decreases monotonically as the sparsification factor β grows.

Proof. Please refer to Appendix D.

This conclusion is also expected, i.e., a larger β means that more non-zero components are preserved after sparsification, so the edge server can obtain more data/information for training.

    5.2 Quantization Resource Allocation

In this subsection, we propose a quantization level allocation scheme based on channel dynamics to improve the communication efficiency. To obtain clearer insights into the effect of the quantization level on the training performance, we consider a given sparsification factor β and set β = 1 here.

Based on the discussion that more transmitted data/information can improve the learning accuracy but causes heavier communication overhead, we optimize the quantization level sk to strike a compromise between the training behavior and communication bottleneck relief. From Theorem 2, it is observed that decreasing the upper bound (or improving the learning accuracy) is equivalent to minimizing the quantization-dependent term of the bound, ignoring the constants.

First, we simplify the number of transmission bits after Elias coding in QSGD [16] under the practical assumption I ≫ sk:

where c is a constant related to the dimension I of the original vector. For device k, the transmission bits after coding should be no more than the bits that can be transmitted at the rate corresponding to its subchannel capacity during the same period. That is,

where rk denotes the Shannon capacity, rk = W log2(1 + P|hk|²/N0). Here W and P indicate the bandwidth and the transmit power, respectively, hk denotes the channel fading of device k, and N0 is the power of the channel noise.
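As a quick numerical illustration of the capacity constraint, the snippet below evaluates rk for placeholder values of W, P, hk and N0; the -100 dBm noise power matches the simulation section, while the other numbers are assumptions of this sketch.

```python
import numpy as np

def shannon_rate(W, P, h_k, N0):
    """Shannon capacity of device k's subchannel: r_k = W * log2(1 + P*|h_k|^2 / N0)."""
    return W * np.log2(1.0 + P * np.abs(h_k) ** 2 / N0)

# Placeholder example: 1 MHz subchannel, 0.1 W transmit power,
# unit-magnitude fading, and -100 dBm (1e-13 W) noise power.
r_k = shannon_rate(W=1e6, P=0.1, h_k=1.0, N0=1e-13)
```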

Then, we formulate the optimization problem of quantization resource allocation as follows:

Constraint (22a) assures that the total transmission bits are within the channel capacity of the MAC. If we relax sk to be a positive real number, the objective function becomes convex and the constraint set is convex, so P1 becomes a convex optimization problem. The Lagrangian function is

where γ is the non-negative Lagrangian multiplier. According to the following KKT conditions,

Since L1(0) < 0 and L1(+∞) > 0, L1(sk) has only one null point in [1, +∞), denoted as sk7, which is the optimal solution of the Lagrangian function. Since this optimization problem is convex, sk7 is also the globally optimal solution. sk7 can be calculated by the bisection search method, which has a computational complexity of O(log(N)).

In addition, sk should be limited by device k's subchannel capacity. From (22b) along with (20), one can obtain an upper bound sk8 on each quantization level.
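The following sketch shows how the allocation could be computed in practice: a bisection search finds the root sk7 of the Lagrangian derivative on [1, +∞), and the result is capped by the subchannel limit sk8. Since the closed forms of L1 and sk8 are not reproduced here, both are passed in as inputs; the function names are illustrative.

```python
def bisection_root(L1, lo=1.0, hi=1e6, tol=1e-6, max_iter=200):
    """Find the unique zero of L1 on [lo, +inf), assuming L1(lo) < 0 < L1(hi)."""
    while L1(hi) <= 0:          # grow the bracket until the sign changes
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if hi - lo < tol:
            break
        if L1(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def allocate_quantization_level(L1, s_k8):
    """Quantization level of device k: the root s_k7 capped by the subchannel limit s_k8."""
    s_k7 = bisection_root(L1)
    return min(s_k7, s_k8)
```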

VI. SIMULATION RESULTS

We consider a federated learning system with a single edge server and K = 20 edge devices. The devices are uniformly distributed in the coverage of a base station (BS) in which the edge server is embedded. We assume the channel noise is Gaussian white noise. The simulation settings are given as follows. Our study includes two models on three image recognition datasets, as in [33]. The first model is a multilayer perceptron (MLP) with 2 hidden layers of 200 units each, where each unit uses ReLU activation. This MLP model is used for training on the MNIST and Fashion-MNIST datasets. The second model is a CNN with two 5 × 5 convolution layers (the first with 32 channels, the second with 64, each followed by a 2×2 max pooling layer), a fully connected layer with 512 units and ReLU activation, and a final softmax output layer. This CNN model is used for training on the CIFAR-10 dataset. After training the classifier on the training dataset, the classifier is evaluated on the testing dataset to verify its reliability. The test accuracy and training loss serve as metrics to verify the efficiency and reliability of the proposed scheme.
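For concreteness, a PyTorch sketch of the two classifiers described above is given below. The layer sizes follow the text; the input and output dimensions (28×28 grayscale images with 10 classes for MNIST/Fashion-MNIST, 32×32 RGB images with 10 classes for CIFAR-10), the "same" padding of the convolution layers, and deferring the softmax to the cross-entropy loss are assumptions of this sketch rather than details from the paper.

```python
import torch.nn as nn

# MLP used for MNIST / Fashion-MNIST: two hidden layers of 200 ReLU units each.
mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 200), nn.ReLU(),
    nn.Linear(200, 200), nn.ReLU(),
    nn.Linear(200, 10),
)

# CNN used for CIFAR-10: two 5x5 conv layers (32 and 64 channels), each followed
# by 2x2 max pooling, then a 512-unit fully connected layer and the output layer.
cnn = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 512), nn.ReLU(),
    nn.Linear(512, 10),   # softmax is applied implicitly by the cross-entropy loss
)
```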

    6.1 Performance Comparison

To evaluate the performance of the proposed scheme, we consider SGD and signSGD as benchmarks. All the schemes are run under the same experimental settings and the power of the Gaussian white noise is -100 dBm. The curves of the test accuracy versus the number of global epochs (or communication rounds) and those of the training loss are shown in Figure 3. We can observe that SGD performs best, since there is no communication compression and thus the local gradients are transmitted to the server most reliably among the three schemes. Second, the proposed scheme achieves higher performance than signSGD because of its higher quantization accuracy.

Figure 3. Performance comparison between different schemes.

    6.2 Robustness to Noise

Note that the proposed scheme allocates the quantization levels based on the channel dynamics of the devices. To evaluate the robustness to channel noise, we investigate the performance of the three schemes on the MNIST dataset over different noise powers. In Figure 4 and Figure 5, we find that when the noise power is small, all the schemes can maintain the training performance. However, when the noise power becomes larger, the performances of SGD and signSGD degrade seriously, while the proposed scheme still keeps a high test accuracy and low test loss. The reason for the performance degradation of the SGD and signSGD schemes is that their quantization levels are not adaptive to the channel conditions, i.e., their quantization levels are constant during the whole training. This demonstrates the robustness of the proposed scheme to channel dynamics.

Figure 4. Accuracy comparison under different noise power.

Figure 5. Loss comparison under different noise power.

    6.3 Overhead Reduction

We assume that the length of the vectors I is 10000 and device k's local stochastic gradient has P = 6 arrays. Based on this, we investigate the relationship between the communication overhead reduction (in bits) Bk and its two variables. The curve of Bk versus β is shown in Figure 6 and the curve of Bk versus sk is shown in Figure 7. From Proposition 1 and Proposition 2, we see that β and sk have complementary influences on Bk. As for β, Bk decreases as β increases when sk is in the ranges [1, sk1) and [sk2, +∞), while it has an inflection point when sk belongs to [sk1, sk2). Considering sk, Bk always first decreases and then increases monotonically as sk grows, no matter how β is valued. The above observations are in accordance with the theoretical analysis.

Figure 6. The relationship of Bk versus β.

Figure 7. The relationship of Bk versus sk.

    6.4 Adaptivity to Channel Condition

The curves of test accuracy and training loss under different channel capacities are shown in Figure 8. The larger the channel capacity, the better the training performance. The reason is that a larger channel capacity allows devices to quantize their stochastic gradients more precisely by using a larger quantization level sk.

Figure 8. Performance comparison under different channel capacities.

VII. CONCLUSION

In this paper, we focused on compressing the communication overhead of federated learning over wireless networks by sparsifying and quantizing local stochastic gradients. We analyzed the relationship of the communication overhead reduction versus the sparsification and quantization factors, and derived the convergence rate of the proposed scheme. Moreover, we proposed a quantization resource allocation scheme to enhance the learning performance.

    ACKNOWLEDGEMENT

    This paper was supported in part by the National Key Research and Development Program of China under Grant 2020YFB1807700 and in part by the National Science Foundation of China under Grant U200120122.

    APPENDIX

    A Proof of Theorem 1

After sparsification, the remaining components are β percent of the initial vector's components, as shown in (13). Substituting (13) into (12), the transmission bits of the p-th vector after sparsification with quantization level sk are as follows:

Therefore, the reduction of the communication overhead is

According to the definition in (9), we know that ξi(v, s) = 0 if Qs(vi) = 0. Then,

This proves Theorem 1.

    B Proof of Proposition 1

Then the value of Z is correlated with sk. We relax sk to a real number and calculate the derivatives of Z with respect to sk, where at least the first and second derivatives exist and both are continuous:

    C Proof of Theorem 2

Using Assumption 2, we can get

Next, taking the expectation over the randomness in the trajectory and performing a telescoping sum over all the iterations, we can get the lower bound of F(0) − F* as

Rearranging (C.14), it follows that

This proves Theorem 2.

    D Proof of Proposition 3

Let U denote the convergence upper bound and make some rearrangement:

Since 0 < β < 1, we can determine whether U1 is positive or not through the comparison between β² and 1. Considering

    国产1区2区3区精品| 国语自产精品视频在线第100页| 欧美精品亚洲一区二区| 婷婷六月久久综合丁香| 久久精品91无色码中文字幕| 国产v大片淫在线免费观看| 窝窝影院91人妻| 日本熟妇午夜| 2021天堂中文幕一二区在线观 | 国产成人一区二区三区免费视频网站| 久久 成人 亚洲| 亚洲中文字幕一区二区三区有码在线看 | 丝袜人妻中文字幕| www.熟女人妻精品国产| 曰老女人黄片| 亚洲国产精品999在线| 免费看a级黄色片| 久久久久久久久久黄片| 男女做爰动态图高潮gif福利片| 免费在线观看影片大全网站| 999久久久精品免费观看国产| 侵犯人妻中文字幕一二三四区| 国产av又大| 免费看十八禁软件| 成年人黄色毛片网站| 国产极品粉嫩免费观看在线| 日本撒尿小便嘘嘘汇集6| 99久久综合精品五月天人人| 正在播放国产对白刺激| 淫妇啪啪啪对白视频| 免费在线观看视频国产中文字幕亚洲| 精品人妻1区二区| 女性生殖器流出的白浆| 三级毛片av免费| 啦啦啦观看免费观看视频高清| 亚洲人成网站在线播放欧美日韩| 日本黄色视频三级网站网址| 中文在线观看免费www的网站 | 嫩草影院精品99| 中亚洲国语对白在线视频| 中出人妻视频一区二区| 日韩精品中文字幕看吧| 国产欧美日韩一区二区三| 亚洲一码二码三码区别大吗| 国产精品永久免费网站| 亚洲一区二区三区色噜噜| 日本精品一区二区三区蜜桃| 18禁黄网站禁片午夜丰满| 脱女人内裤的视频| 亚洲专区中文字幕在线| 免费高清在线观看日韩| 夜夜夜夜夜久久久久| 91九色精品人成在线观看| 日韩精品青青久久久久久| 久久久久久久午夜电影| 日韩一卡2卡3卡4卡2021年| 看黄色毛片网站| 91字幕亚洲| 19禁男女啪啪无遮挡网站| 亚洲一卡2卡3卡4卡5卡精品中文| 午夜日韩欧美国产| 日韩成人在线观看一区二区三区| 国产精品久久久久久人妻精品电影| 久久久久国产一级毛片高清牌| 两性午夜刺激爽爽歪歪视频在线观看 | 久久精品人妻少妇| 成人免费观看视频高清| 亚洲人成伊人成综合网2020| 欧美 亚洲 国产 日韩一| www.自偷自拍.com| av在线天堂中文字幕| 啦啦啦免费观看视频1| 亚洲狠狠婷婷综合久久图片| 亚洲人成伊人成综合网2020| 国产免费男女视频| 搡老岳熟女国产| 亚洲一码二码三码区别大吗| 成熟少妇高潮喷水视频| 手机成人av网站| 两个人免费观看高清视频| 一级片免费观看大全| √禁漫天堂资源中文www| 国产精品免费视频内射| www.999成人在线观看| 国产欧美日韩一区二区精品| 亚洲av片天天在线观看| 亚洲国产毛片av蜜桃av| 精品久久久久久久末码| 午夜免费鲁丝| 99国产极品粉嫩在线观看| 久久久久久大精品| 国产国语露脸激情在线看| 国产亚洲精品综合一区在线观看 | or卡值多少钱| 亚洲成国产人片在线观看| 不卡av一区二区三区| 国产又色又爽无遮挡免费看| 精品免费久久久久久久清纯| 亚洲欧洲精品一区二区精品久久久| 国产视频一区二区在线看| 88av欧美| 操出白浆在线播放| 国产精品亚洲美女久久久| 18禁黄网站禁片免费观看直播| av视频在线观看入口| 日韩欧美在线二视频| 亚洲熟女毛片儿| 中文字幕人妻丝袜一区二区| 亚洲人成伊人成综合网2020| 免费在线观看视频国产中文字幕亚洲| 免费av毛片视频| 国产av又大| 热re99久久国产66热| 亚洲精品一区av在线观看| 一夜夜www| 亚洲国产中文字幕在线视频| 麻豆久久精品国产亚洲av| av有码第一页| 中出人妻视频一区二区| 亚洲国产高清在线一区二区三 | 天堂影院成人在线观看| 日本免费一区二区三区高清不卡| 欧美性猛交╳xxx乱大交人| 国产视频内射| 国语自产精品视频在线第100页| 久久中文看片网| tocl精华| 好看av亚洲va欧美ⅴa在| 巨乳人妻的诱惑在线观看| 亚洲五月天丁香| 久久热在线av| 精品人妻1区二区| 欧美成人性av电影在线观看| 最近最新中文字幕大全电影3 | 一区二区三区精品91| 窝窝影院91人妻| 亚洲 国产 在线| 国产亚洲精品久久久久5区| www.精华液| 久久精品91蜜桃| av有码第一页| 特大巨黑吊av在线直播 | 后天国语完整版免费观看| 久久久久国产一级毛片高清牌| aaaaa片日本免费| 国产精品,欧美在线| 国产av不卡久久| 国产v大片淫在线免费观看| 高潮久久久久久久久久久不卡| 好男人电影高清在线观看| 免费观看人在逋| 久久伊人香网站| 亚洲电影在线观看av| av有码第一页| 国产v大片淫在线免费观看| 日本成人三级电影网站| videosex国产| 欧美中文综合在线视频| 日本免费一区二区三区高清不卡| 免费搜索国产男女视频| 日日夜夜操网爽| 亚洲av成人不卡在线观看播放网| 母亲3免费完整高清在线观看| 观看免费一级毛片| 国产野战对白在线观看| 午夜福利在线观看吧| 亚洲 欧美一区二区三区| 最好的美女福利视频网| 一本久久中文字幕| 久久久久免费精品人妻一区二区 | 久久精品91无色码中文字幕| 亚洲国产欧美网| 欧美在线黄色| 国产精品美女特级片免费视频播放器 | 亚洲一卡2卡3卡4卡5卡精品中文| 久久久久国产精品人妻aⅴ院| 久久久久久久久久黄片| 桃色一区二区三区在线观看| 亚洲中文字幕一区二区三区有码在线看 | 伊人久久大香线蕉亚洲五| 亚洲一码二码三码区别大吗| 啦啦啦韩国在线观看视频| 欧美av亚洲av综合av国产av| 别揉我奶头~嗯~啊~动态视频| 国产成+人综合+亚洲专区| 色综合亚洲欧美另类图片| 18禁黄网站禁片午夜丰满| 757午夜福利合集在线观看| 亚洲第一青青草原| 国产单亲对白刺激| 亚洲国产欧美日韩在线播放| 亚洲国产精品成人综合色| 亚洲欧美日韩高清在线视频| 亚洲精品久久成人aⅴ小说| 国产熟女午夜一区二区三区| 亚洲午夜精品一区,二区,三区| 免费高清视频大片| 熟女少妇亚洲综合色aaa.| 色哟哟哟哟哟哟| 免费看美女性在线毛片视频| 50天的宝宝边吃奶边哭怎么回事| 一级a爱片免费观看的视频| 两性夫妻黄色片| 国产免费男女视频| 女性生殖器流出的白浆| 国产又黄又爽又无遮挡在线| 亚洲成人精品中文字幕电影| 亚洲人成电影免费在线| 一级a爱视频在线免费观看| 国产午夜精品久久久久久| 婷婷丁香在线五月| 久久精品人妻少妇| 亚洲五月婷婷丁香| 精品福利观看| 女性生殖器流出的白浆| 观看免费一级毛片| 久久精品夜夜夜夜夜久久蜜豆 | 欧美成人免费av一区二区三区| 在线观看www视频免费| 国内毛片毛片毛片毛片毛片| 久久人妻av系列| 色在线成人网| 无人区码免费观看不卡| 亚洲专区国产一区二区| 免费女性裸体啪啪无遮挡网站| 我的亚洲天堂| 丰满的人妻完整版| 女人被狂操c到高潮| 亚洲电影在线观看av| 男女之事视频高清在线观看| 国产激情偷乱视频一区二区| 久久精品aⅴ一区二区三区四区| 国产精品美女特级片免费视频播放器 | 成熟少妇高潮喷水视频| 大型黄色视频在线免费观看| 午夜精品在线福利| 国产精品乱码一区二三区的特点| 亚洲av电影在线进入| 国内久久婷婷六月综合欲色啪| 自线自在国产av| 欧美成人一区二区免费高清观看 | 久9热在线精品视频| 一进一出抽搐动态| 操出白浆在线播放| 欧美国产精品va在线观看不卡| 天天躁狠狠躁夜夜躁狠狠躁| a级毛片a级免费在线| 黄色片一级片一级黄色片| 欧美乱色亚洲激情| 亚洲国产精品sss在线观看| 久久精品91无色码中文字幕| 精品一区二区三区av网在线观看| 男女午夜视频在线观看| 两个人视频免费观看高清| 亚洲天堂国产精品一区在线| 老司机靠b影院| 亚洲av电影在线进入| 首页视频小说图片口味搜索| 侵犯人妻中文字幕一二三四区| 女性生殖器流出的白浆| 欧美中文综合在线视频| 国产高清视频在线播放一区| av在线天堂中文字幕| 国产成人欧美在线观看| 一本综合久久免费| 久久欧美精品欧美久久欧美| 色综合站精品国产| 在线看三级毛片| 久久久久国产精品人妻aⅴ院| 久久天堂一区二区三区四区| 午夜精品在线福利| 看片在线看免费视频| 欧美av亚洲av综合av国产av| 亚洲黑人精品在线| 午夜久久久久精精品| 丝袜在线中文字幕| 97碰自拍视频| 日韩欧美一区视频在线观看| 
视频在线观看一区二区三区| 禁无遮挡网站| 亚洲黑人精品在线| 可以免费在线观看a视频的电影网站| 女人爽到高潮嗷嗷叫在线视频| 亚洲美女黄片视频| 欧美黑人巨大hd| 黄色a级毛片大全视频| 18禁观看日本| 久久久国产精品麻豆| 首页视频小说图片口味搜索| 国产精品一区二区精品视频观看| 国产精品,欧美在线| 久99久视频精品免费| 亚洲专区字幕在线| 午夜福利一区二区在线看| 国产色视频综合| 精品国产国语对白av| 18禁黄网站禁片免费观看直播| 麻豆av在线久日| 搞女人的毛片| 日本免费a在线| 波多野结衣巨乳人妻| 99精品欧美一区二区三区四区| 国产精品 欧美亚洲| 亚洲第一av免费看| 成年女人毛片免费观看观看9| 国产精品久久久久久亚洲av鲁大| 国产精品美女特级片免费视频播放器 | 久久午夜亚洲精品久久| 不卡一级毛片| 麻豆国产av国片精品| 熟女电影av网| 嫁个100分男人电影在线观看| 手机成人av网站| 日本免费a在线| 精品国产国语对白av| 国产一区二区激情短视频| 欧美一级a爱片免费观看看 | 国产极品粉嫩免费观看在线| 亚洲专区中文字幕在线| 99久久国产精品久久久| 波多野结衣高清无吗| 久久久国产成人免费| 精品国产超薄肉色丝袜足j| av超薄肉色丝袜交足视频| 超碰成人久久| 一夜夜www| 一本综合久久免费| 国产伦人伦偷精品视频| 日韩国内少妇激情av| 香蕉丝袜av| 久久久久久久久中文| 国产av不卡久久| 午夜免费鲁丝| 亚洲精品粉嫩美女一区| 国产激情偷乱视频一区二区| 亚洲精品美女久久av网站| 黄片播放在线免费| 一区二区日韩欧美中文字幕| 午夜久久久在线观看| 中文字幕人成人乱码亚洲影| 十分钟在线观看高清视频www| 人人妻,人人澡人人爽秒播| 国产麻豆成人av免费视频| 午夜福利高清视频| 国产麻豆成人av免费视频| 午夜福利高清视频| 午夜激情福利司机影院| 2021天堂中文幕一二区在线观 | 午夜福利一区二区在线看| 欧美中文综合在线视频| 他把我摸到了高潮在线观看| 久久国产精品人妻蜜桃| 深夜精品福利| 可以在线观看毛片的网站| 国产av不卡久久| 日韩大尺度精品在线看网址| 99在线人妻在线中文字幕| 亚洲欧美日韩高清在线视频| 欧美激情极品国产一区二区三区| 热re99久久国产66热| 成人亚洲精品一区在线观看| 国产熟女xx| 国产精品日韩av在线免费观看| 一级毛片女人18水好多| 国产一区二区在线av高清观看| 欧美最黄视频在线播放免费| 欧美日韩中文字幕国产精品一区二区三区| 亚洲av成人一区二区三| 国产精品av久久久久免费| 色综合婷婷激情| videosex国产| av电影中文网址| av在线天堂中文字幕| 美国免费a级毛片| 99国产综合亚洲精品| 在线观看舔阴道视频| 亚洲狠狠婷婷综合久久图片| 老司机深夜福利视频在线观看| 黄频高清免费视频| 午夜激情av网站| 色精品久久人妻99蜜桃| 日本黄色视频三级网站网址| 久久香蕉激情| 婷婷精品国产亚洲av在线| 国内精品久久久久久久电影| 亚洲国产毛片av蜜桃av| 日韩欧美在线二视频| 丰满的人妻完整版| 亚洲欧美日韩高清在线视频| videosex国产| 国产免费男女视频| 日本 欧美在线| 在线观看66精品国产| 亚洲黑人精品在线| 亚洲三区欧美一区| 国产欧美日韩一区二区精品| 此物有八面人人有两片| 国产单亲对白刺激| 两个人免费观看高清视频| 精品久久久久久久久久久久久 | 成人亚洲精品av一区二区| 国内揄拍国产精品人妻在线 | 国产一区在线观看成人免费| 日韩欧美在线二视频| 丰满的人妻完整版| 精品国产乱子伦一区二区三区| 哪里可以看免费的av片| 亚洲av片天天在线观看| 狠狠狠狠99中文字幕| 成年版毛片免费区| 在线观看免费日韩欧美大片| 国内少妇人妻偷人精品xxx网站 | 美女 人体艺术 gogo| 神马国产精品三级电影在线观看 | 两性午夜刺激爽爽歪歪视频在线观看 | 日韩高清综合在线| 日韩欧美 国产精品| 午夜激情福利司机影院| 18禁黄网站禁片免费观看直播| 午夜久久久在线观看| 1024香蕉在线观看| 成人一区二区视频在线观看| 国产一区二区三区在线臀色熟女| 女性被躁到高潮视频| 老鸭窝网址在线观看| 一级a爱视频在线免费观看| 国产色视频综合| 国产爱豆传媒在线观看 | 亚洲一区二区三区不卡视频| 亚洲一码二码三码区别大吗| 亚洲成人免费电影在线观看| 久9热在线精品视频| 国产精品久久久久久亚洲av鲁大| 91麻豆av在线| 国产一卡二卡三卡精品| 看片在线看免费视频| 国内毛片毛片毛片毛片毛片| 制服丝袜大香蕉在线| 法律面前人人平等表现在哪些方面| 国产91精品成人一区二区三区| 91老司机精品| 欧美日韩中文字幕国产精品一区二区三区| 欧美 亚洲 国产 日韩一| 亚洲国产欧美一区二区综合| 欧美亚洲日本最大视频资源| 精品一区二区三区四区五区乱码| 国产又爽黄色视频| 国产aⅴ精品一区二区三区波| 99在线人妻在线中文字幕| netflix在线观看网站| 国产高清有码在线观看视频 | 自线自在国产av| 国产单亲对白刺激| 丰满的人妻完整版| 黄色毛片三级朝国网站| 51午夜福利影视在线观看| 一二三四在线观看免费中文在| 香蕉国产在线看| 97碰自拍视频| 精品久久久久久成人av| 国产成年人精品一区二区| 精品久久久久久久人妻蜜臀av| 欧美日韩亚洲综合一区二区三区_| 曰老女人黄片| 好男人在线观看高清免费视频 | 欧美国产日韩亚洲一区| 亚洲熟妇熟女久久| 香蕉久久夜色| 亚洲精华国产精华精| 国产成人欧美| 女人被狂操c到高潮| 少妇被粗大的猛进出69影院| 精品国内亚洲2022精品成人| 黄频高清免费视频| 成人特级黄色片久久久久久久| 黄色片一级片一级黄色片| 18禁黄网站禁片免费观看直播| 级片在线观看| a级毛片a级免费在线| 激情在线观看视频在线高清| 欧美色视频一区免费| 午夜福利在线在线| 久久伊人香网站| 国产黄a三级三级三级人| 熟妇人妻久久中文字幕3abv| 丝袜人妻中文字幕| 国产精品久久久久久亚洲av鲁大| 日韩欧美在线二视频| 国产av不卡久久| 搞女人的毛片| 青草久久国产| 久久国产乱子伦精品免费另类| 丰满人妻熟妇乱又伦精品不卡| 国产黄a三级三级三级人| 中文字幕精品免费在线观看视频| 99riav亚洲国产免费| www.自偷自拍.com| 国产久久久一区二区三区| 黄色视频不卡| 少妇熟女aⅴ在线视频| 久久九九热精品免费| 男女午夜视频在线观看| 制服人妻中文乱码| 一个人免费在线观看的高清视频| 国产精品免费视频内射| 国产成人一区二区三区免费视频网站| 日韩欧美一区视频在线观看| 精品久久久久久久人妻蜜臀av| 99国产精品一区二区蜜桃av| 悠悠久久av| 亚洲欧美精品综合久久99| 亚洲精品色激情综合| 亚洲午夜理论影院| 99riav亚洲国产免费| 国产精品影院久久| 天堂影院成人在线观看| 黄色视频,在线免费观看| 亚洲五月色婷婷综合| x7x7x7水蜜桃| 91麻豆精品激情在线观看国产| 亚洲av成人一区二区三| 日韩高清综合在线| 国产三级在线视频| xxx96com| 国产黄色小视频在线观看| 妹子高潮喷水视频| 女人被狂操c到高潮| 精品乱码久久久久久99久播| 亚洲一区二区三区不卡视频| 午夜福利视频1000在线观看| 亚洲欧美日韩无卡精品| 国产在线精品亚洲第一网站| 国产伦人伦偷精品视频| 麻豆久久精品国产亚洲av| 欧美一级毛片孕妇| 国产av不卡久久| 亚洲一区中文字幕在线| 亚洲成人久久爱视频| 亚洲片人在线观看| 一级毛片高清免费大全| 最近最新免费中文字幕在线| 欧美+亚洲+日韩+国产| 欧美国产日韩亚洲一区| 国产精品自产拍在线观看55亚洲| 国产亚洲精品久久久久久毛片| 亚洲成a人片在线一区二区| 1024香蕉在线观看| 中文字幕精品免费在线观看视频| 男人操女人黄网站| 99re在线观看精品视频| 久久国产亚洲av麻豆专区| a级毛片在线看网站| 欧美又色又爽又黄视频| 黄色丝袜av网址大全| 极品教师在线免费播放| 国产一级毛片七仙女欲春2 | 岛国视频午夜一区免费看| 老司机深夜福利视频在线观看| 好男人在线观看高清免费视频 | 国内精品久久久久精免费| 久久精品国产亚洲av高清一级| 
色哟哟哟哟哟哟| 欧美日韩亚洲国产一区二区在线观看| 国产亚洲欧美在线一区二区| 黑人巨大精品欧美一区二区mp4| 国产视频内射| 亚洲久久久国产精品| 色老头精品视频在线观看| 人人妻人人澡人人看| 久久天堂一区二区三区四区| 一本大道久久a久久精品| 男人操女人黄网站| 久久精品aⅴ一区二区三区四区| 精品卡一卡二卡四卡免费| 亚洲五月婷婷丁香| 精品国产一区二区三区四区第35| 成人欧美大片| 一边摸一边抽搐一进一小说| 亚洲真实伦在线观看| 91老司机精品| 亚洲国产精品成人综合色| 国产精品影院久久| 淫妇啪啪啪对白视频| 人成视频在线观看免费观看| 国产精品影院久久| 国产成人av激情在线播放| 日韩av在线大香蕉| 精品人妻1区二区| 黄色a级毛片大全视频| 人成视频在线观看免费观看| 99国产极品粉嫩在线观看| 国产成人av激情在线播放| 六月丁香七月| 国产午夜精品久久久久久一区二区三区 | 欧美区成人在线视频| 久久精品国产99精品国产亚洲性色| 波多野结衣高清作品| 婷婷精品国产亚洲av| 日本撒尿小便嘘嘘汇集6| 亚洲熟妇中文字幕五十中出| 人人妻人人澡人人爽人人夜夜 | 男女啪啪激烈高潮av片| 老师上课跳d突然被开到最大视频| 69人妻影院| 神马国产精品三级电影在线观看| 淫妇啪啪啪对白视频| 一个人观看的视频www高清免费观看| 国产av在哪里看| 亚洲在线观看片| 亚洲婷婷狠狠爱综合网| 欧美日韩在线观看h| 亚洲av中文字字幕乱码综合| 免费观看的影片在线观看| 国内揄拍国产精品人妻在线| 91狼人影院|