
A Multi-Agent Reinforcement Learning-Based Collaborative Jamming System: Algorithm Design and Software-Defined Radio Implementation

China Communications, 2022, Issue 10 (published 2022-10-27)

Luguang Wang, Fei Song, Gui Fang, Zhibin Feng, Wen Li, Yifan Xu, Chen Pan, Xiaojing Chu

College of Communications Engineering, Army Engineering University of PLA, Nanjing 210000, China

*Corresponding author, email: songfei2021123@163.com

Abstract: In multi-agent confrontation scenarios, a single jammer is constrained by its limited performance and is inefficient in practical applications. To cope with these issues, this paper investigates the multi-agent jamming problem in a multi-user scenario, where coordination between the jammers is considered. Firstly, a multi-agent Markov decision process (MDP) framework is used to model and analyze the multi-agent jamming problem. Secondly, a collaborative multi-agent jamming algorithm (CMJA) based on reinforcement learning is proposed. Finally, an actual intelligent jamming system is designed and built based on a software-defined radio (SDR) platform for simulation and platform verification. The simulation and platform verification results show that the proposed CMJA algorithm outperforms the independent Q-learning method and provides a better jamming effect.

Keywords: multi-agent reinforcement learning; intelligent jamming; collaborative jamming; software-defined radio platform

I. INTRODUCTION

In recent years, wireless jamming technology has become a key topic in spectrum security [1]. Traditional jamming methods mainly include fixed-frequency jamming, sweeping jamming, comb jamming and so on, which usually obey preset operating modes and can be easily avoided by dynamic spectrum access [2–4]. Moreover, with the promotion of artificial intelligence and software radio technology, the intelligence of target systems has improved and anti-jamming technology has developed continuously [5–13], posing serious challenges to traditional jamming means. Thus, intelligent jamming technology combined with machine learning has been widely studied in recent years. However, the existing works mainly focus on the single-user scenario and lack verification on actual platforms. As a result, there is a need to continuously strengthen jamming capability in multi-user scenarios to cope with communication confrontations in complex environments. In this paper, we investigate the problem of jamming channel selection in a multi-user scenario, where collaboration between jammers is considered.

However, the following problems in the study of decision-making in the jamming process have still not been fully solved. Firstly, it is difficult to attack multiple users since the jamming ability of a single jammer is limited. Secondly, group intelligence confrontation is a developing trend, and the existing jamming algorithms are inefficient in dealing with opponents whose anti-jamming ability keeps increasing. Finally, some researchers have proposed intelligent jamming algorithms in theory but have not verified them in actual systems. Therefore, conducting research on multi-agent jamming methods and improving the overall jamming efficiency of the jammers is an effective means of addressing the shortcomings of independent jamming.

To tackle the above-mentioned problems, it is urgent and necessary to study jamming models and intelligent algorithms for multi-agent jammers in a multi-user scenario and to establish the long-term advantage of jammers in communication confrontation. However, to implement a collaborative multi-agent jamming algorithm, several key challenges need to be addressed: (i) The model should be reasonable. A reasonable model framework is the premise of subsequent learning and decision-making. In particular, the design of the jamming reward value is the key to optimizing jamming decisions. (ii) The jamming policy should be effective. For different communication modes of the users, it is important to ensure the effectiveness and applicability of the jamming policy. (iii) The collaboration between jammers should be efficient. Efficient collaboration between jammers can avoid wasting jamming resources and maximize resource utilization.

In a multi-user communication scenario, we analyze the jamming problem based on the multi-agent Markov decision process (MDP) framework and design the reward value for jamming evaluation. To avoid decision conflicts between jammers, we design a collaborative jamming mechanism between the jammers and propose a jamming algorithm based on multi-agent reinforcement learning. The specific contributions of this paper are summarized as follows:

· In order to avoid conflicting decisions between jammers, a collaborative multi-agent jamming algorithm (CMJA) based on multi-agent reinforcement learning is proposed, which performs distributed calculation and collaborative decision-making.

· The simulation results verify that the proposed CMJA algorithm is effective and outperforms the independent Q-learning algorithm, and a convergence proof of the proposed algorithm is given.

· A practical collaborative jamming system based on the software-defined radio (SDR) platform is designed and built, which contains three subsystems: an intelligent jamming subsystem, a wireless transmission and communication subsystem, and a confrontation visualization subsystem. The proposed CMJA algorithm is verified in this practical system.

The rest of this paper is organized as follows: Related work is reviewed in Section II. The system model and problem formulation of the collaborative decision-making of multiple jammers are presented in Section III. In Section IV, the details of the proposed CMJA algorithm are described. Simulation results and a discussion of them are given in Section V. In Section VI, the multi-agent jamming system is introduced. Finally, the conclusion is drawn in Section VII.

II. RELATED WORK

There is no doubt that communication confrontation has attracted a great deal of attention and has become a research hotspot in recent years. In the field of anti-jamming communication, various methods and theories, such as game theory [5–11] and machine learning [12, 13], have been proposed. Some studies [5–10] modeled and solved the interaction between users and jammers using game theory to find the best anti-jamming decisions. In [11], the authors used the Stackelberg game to develop a model and assumed that the user knows the power decision set of the jammer. Machine learning has made significant advances in making decisions in dynamic environments. In the study by Liu et al. [12], the spectrum waterfall diagram was fed into a convolutional neural network, which used a deep reinforcement learning (DRL) algorithm to achieve the anti-jamming effect. Chen et al. [13] used the DRL algorithm to optimize power selection in anti-jamming communications and implemented their proposed algorithm using universal software radio peripheral (USRP) devices. It can be seen that many achievements have been made in the field of intelligent anti-jamming. In contrast, only a few research studies on intelligent jamming have been carried out. Therefore, it is necessary to study intelligent jamming techniques to implement precision jamming.

In [14], the authors classified the traditional jamming means into four models: constant jamming, deceptive jamming, random jamming and reactive jamming. Sweep jamming and block jamming were also considered in [15]. Xu et al. [16] gave a definition of communication jamming. According to this definition, jamming types can be divided into physical layer jamming and link layer jamming [17, 18]. The premise of link layer jamming is that the link layer information (protocol, frame format, etc.) of the users is known, which is difficult to achieve in reality. Therefore, it is more straightforward and effective to study physical layer jamming methods.

Based on the above-mentioned studies, various learning-based jamming methods have been proposed [19–23]. Shi et al. [19] proposed an intelligent jamming algorithm in the frequency domain, in which the jammer can sense and learn the user's channel switching policy to achieve tracking jamming. Amuru et al. [20] proposed a power-domain intelligent jamming algorithm, where the jammer can adjust its power according to the state of the users. In [21], the authors evaluated the jamming effect of a reinforcement learning-based algorithm under different anti-jamming strategies. Zhang et al. [22] proposed a jamming method for virtual decision-making and validated it using the USRP platform. In [23], a deep learning-based jammer used generative adversarial networks to achieve accurate jamming with a limited number of samples. However, the above works are all based on a single jammer, which is inefficient in multi-agent confrontation scenes.

It is important to study the communication confrontation of multi-agent cooperation. Some studies have applied multi-agent reinforcement learning to anti-jamming scenarios [24–27]: smart users can avoid mutual interference and external malicious jamming signals through cooperation, which enhances their anti-jamming ability. At present, the research on collaborative jamming is mainly aimed at friendly jamming to ensure the security of one's own communication when facing an eavesdropping enemy [28–30]. The literature on multiple jammers mainly focuses on cooperative spoofing for radar detection. In [31], the authors investigated a game-theoretic power allocation problem between a radar system and multiple jammers to determine the optimal power allocation. Chang et al. [32] proposed a novel jamming problem modeling idea to estimate the optimal jamming amplitude. To confront the threat of radar nets, an artificial bee colony-based jamming resource allocation algorithm was proposed in [33]. In [34], cooperative perception of the electromagnetic information of a radiation source target was realized by exploiting the information sharing of multiple jammers. However, among the works on multiple jammers, there is relatively little research on the selection of jamming channels to disrupt the opponent's communications. Therefore, this paper studies the collaborative jamming of multiple jammers against an opponent, which disrupts the opponent's normal communication by selecting jamming channels.

Figure 1. Schematic of the system model.

To sum up, the existing studies mainly focused on the intelligent jamming technology of a single jammer and did not consider collaborative jamming using multiple jammers. Therefore, we are inspired to study the collaborative jamming problem of multi-agent jammers. To this end, we propose a collaborative jamming algorithm based on multi-agent reinforcement learning and verify it in an actual system.

III. SYSTEM MODEL AND PROBLEM FORMULATION

    3.1 System Model

The system model is shown in Figure 1 and considers a scenario with M intelligent jammers and N communication users (transmitter-receiver pairs). Each smart jammer has intelligent decision-making capability and consists of a spectrum-sensing subsystem and a jamming-decision subsystem. The sets of jammers and users are denoted by M = {j_1, ..., j_M} and N = {u_1, ..., u_N}, respectively. There are K available channels, which are denoted as K = {f_1, ..., f_K} (K ≥ N). We consider a time-slotted system, and the length of each time slot is the same for the user and the jammer. Each jammer selects one channel in each time slot to release jamming signals, and each user can select only one transmission channel.

Figure 2. The change model of the user channel.

Figure 3. The diagram of the jamming time slot structure.

We assume that the users adopt a probabilistic frequency hopping model based on a preset sequence [21] and can collaborate to avoid internal interference.

For example, we assume that the nth user's frequency hopping sequence is F = {f_1^n, f_2^n, ..., f_{K-1}^n, f_K^n}, where K denotes the number of available channels. The channel of the nth user in the tth slot is f_k^n, and we denote this as C_n(t) = f_k^n. The user's channel in the next slot can then be expressed as:

$$C_n(t+1)=\begin{cases}f_k^n, & \text{with probability } \varepsilon,\\ f_{k'}^n, & \text{with probability } 1-\varepsilon,\end{cases}\tag{1}$$

where ε ∈ (0,1) denotes the probability that the channel remains unchanged and k' = (k+1) mod K indexes the next channel in the frequency hopping sequence F. Eq. (1) shows that the user remains on the current channel with probability ε and switches to the next channel in its frequency hopping sequence with probability 1 − ε.

All users collaborate to execute actions according to Eq. (1) in the same time slot. For any two different users m and n, we have

$$C_m(t)\neq C_n(t),\quad \forall\, m\neq n,\tag{2}$$

i.e., no two users occupy the same channel in the same time slot.

As shown in Figure 2 (K = 6, N = 2), the horizontal axis indicates the time slot, the vertical axis indicates the available channels, the shaded areas indicate the channels used by the users, and the blank areas indicate idle channels. The users' channels change over time.
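To make the channel model concrete, the following minimal Python sketch (our illustration, not the authors' code) simulates the probabilistic hopping of Eq. (1); one common stay/hop draw is used per slot, which is one simple way of keeping the users collision-free as required by Eq. (2). The sequence F, the starting indices and eps = 0.3 are example values.

    import random

    def hop(sequence, state, eps):
        # Eq. (1): all users jointly remain on their current channels with
        # probability eps, otherwise all advance to the next entry of the
        # hopping sequence, cyclically (k' = (k + 1) mod K).
        K = len(sequence)
        if random.random() < eps:
            return state                      # all users remain
        return [(k + 1) % K for k in state]   # all users hop

    # Example: K = 6 channels and N = 2 users starting on f1 and f4 (cf. Figure 2).
    F = ["f1", "f2", "f3", "f4", "f5", "f6"]
    state = [0, 3]  # indices into F
    for t in range(5):
        print(t, [F[k] for k in state])
        state = hop(F, state, eps=0.3)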

The intelligent jammer has sensing and learning abilities: it can sense the current communication frequency and learn the frequency usage pattern of the user to generate efficient intelligent jamming strategies. Figure 3 shows the time slot structure of the jammer and the user. Each jamming time slot contains a jamming sub-slot T_j, a sensing sub-slot T_wss, and a learning sub-slot T_l. T_j is used for releasing the jamming signals, T_wss is used for sensing the wideband spectrum, and T_l is used for local learning. T_u is the user's communication time. The length of a jamming time slot is T_j + T_wss + T_l and the length of the user's time slot is T_u. The jamming time slot is assumed to be equal to T_u. Different jammers interact with each other to make collaborative jamming decisions.

    3.2 Problem Formulation

The working principle of a single intelligent jammer involves sensing the spectrum state and making decisions through learning, which, in turn, can affect the current state. This sequential decision process is suitable for modeling with a Markov decision process. However, in the multi-agent scenario considered in this paper, the action of any agent can affect the state. Thus, we use an extension of the MDP for modeling multi-agent scenarios.

A multi-agent MDP can be represented as:

$$\left\langle \mathcal{M},\, S,\, A_1,\ldots,A_M,\, Pr,\, r_1,\ldots,r_M \right\rangle,\tag{3}$$

    where the specific meaning of each element is as follows:

· M = {j_1, ..., j_M} denotes the set of intelligent jammers.

· S denotes the environmental state space; s_t ∈ S is an element of the state space and indicates the environment state observed by the jammers.

· A_m, m = 1, ..., M, denotes the action space of the intelligent jammer j_m; a_m ∈ A_m, with A_m = {f_1, ..., f_K}, denotes the action chosen by the jammer j_m.

· Pr: S × A_1 × ... × A_M → [0,1] denotes the state transition probability function, which represents the probability of the state moving to s' after each jammer executes its action a_m ∈ A_m in the state s.

· r_m: S × A_1 × ... × A_M → R denotes the immediate reward obtained after the jammer j_m executes an action a_m ∈ A_m in the state s.

The state of the environment is defined as follows:

$$s_t=\{u_1(t),\ldots,u_N(t)\},\tag{4}$$

where u_n(t) ∈ {f_1, ..., f_K} denotes the channel on which the nth user is communicating in the tth time slot.

Each jammer selects its jamming channel in the state s_t, and the independent action spaces of all jammers are identical: A_1 = A_2 = ··· = A_M. The independent action space of any jammer can be expressed as:

$$A_m=\{f_1,\ldots,f_K\},\tag{5}$$

where the action a_m ∈ {f_1, ..., f_K} denotes the jamming channel of the jammer j_m. The collaborative action a = {a_1, ..., a_M} denotes the combination of the jamming actions of the jammers. Thus, the collaborative jamming action space can be expressed as follows:

$$A=A_1\times A_2\times\cdots\times A_M,\tag{6}$$

where × represents the Cartesian product operation.
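For illustration, the joint action space of Eq. (6) can be enumerated directly with a Cartesian product; the toy sizes K = 3 and M = 2 below are hypothetical.

    import itertools

    K, M = 3, 2  # toy example: 3 channels, 2 jammers
    channels = [f"f{k + 1}" for k in range(K)]
    # A = A_1 x ... x A_M: every combination of per-jammer channel choices.
    joint_actions = list(itertools.product(channels, repeat=M))
    print(len(joint_actions))  # K**M = 9 joint actions
    print(joint_actions[:3])   # [('f1', 'f1'), ('f1', 'f2'), ('f1', 'f3')]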

The transition of the state depends on the change in the users' channels, and this is hard to predict because the behavior of the users is unknown to the jammers.

In this paper, we quantify the effect of jamming suppression as the reward value. When the jammer j_m takes an action a_m that successfully blocks any user channel, the independent reward value of j_m is 1; otherwise, it is 0. To account for the collaboration between the jammers, when another jammer j_n takes the same action, i.e., n ≠ m but a_n = a_m, the reward value is reduced accordingly. The joint reward value of the jammer j_m in the tth slot is defined as:

$$r_m(s_t,\mathbf{a}_t)=\sum_{n=1}^{N}\delta(a_m,u_n(t))-\sum_{n'\neq m}\delta(a_m,a_{n'}),\tag{7}$$

and δ(p, q) is the indicator function:

$$\delta(p,q)=\begin{cases}1, & p=q,\\ 0, & p\neq q.\end{cases}\tag{8}$$

When all the jammers take a joint action a = {a_1, ..., a_M}, the immediate reward value of each jammer and the sum of the overall reward values can be obtained. The total reward value of the jammers taking the joint action a = {a_1, ..., a_M} in the state s_t is expressed as:

$$R(s_t,\mathbf{a})=\sum_{m=1}^{M}r_m(s_t,\mathbf{a}).\tag{9}$$
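A minimal sketch of this reward design, under our reading of Eqs. (7)-(9) (one point for hitting an occupied user channel, one point deducted per duplicated jammer action), is:

    def delta(p, q):
        # Indicator function of Eq. (8).
        return 1 if p == q else 0

    def joint_reward(m, user_channels, joint_action):
        # Our reading of Eq. (7): +1 if jammer m hits an occupied user channel,
        # minus 1 for every other jammer that chose the same channel.
        a_m = joint_action[m]
        hit = sum(delta(a_m, u) for u in user_channels)
        clash = sum(delta(a_m, a_n) for n, a_n in enumerate(joint_action) if n != m)
        return hit - clash

    def total_reward(user_channels, joint_action):
        # Eq. (9): sum of all jammers' joint rewards.
        return sum(joint_reward(m, user_channels, joint_action)
                   for m in range(len(joint_action)))

    # Users on f2 and f5: jamming (f2, f5) earns 1 + 1 = 2 in total,
    # while both jammers piling onto f2 earns (1 - 1) + (1 - 1) = 0.
    print(total_reward(["f2", "f5"], ["f2", "f5"]))  # 2
    print(total_reward(["f2", "f5"], ["f2", "f2"]))  # 0

The second case shows why the penalty term matters: duplicated actions waste a jammer while leaving the user on f5 untouched.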

We define the decision policy of the jammer j_m as π_m, and the joint policy of all jammers as π = {π_1, ..., π_M}. The common goal of the multi-agent system is to obtain the optimal joint policy π* = {π*_1, ..., π*_M}. Each jammer can obtain the maximum long-term cumulative discounted reward by executing the optimal policy π*.

Therefore, each jammer aims at maximizing its cumulative expected reward, which can be expressed as:

$$\max_{\pi}\;\mathbb{E}_{\pi}\!\left[\sum_{\tau=0}^{\infty}\gamma^{\tau}\, r_m(s_{t+\tau},\mathbf{a}_{t+\tau})\right],\tag{10}$$

where s_{t+τ} and a_{t+τ} denote the state and joint action, respectively, in the (t+τ)th time slot. E_π[·] is the mathematical expectation under the joint policy π, r_m is the immediate reward value that the jammer j_m obtains after executing its policy π_m ∈ π, and 0 ≤ γ < 1 denotes the discount factor for long-term rewards.

Figure 4. Schematic of the collaborative multi-agent jamming framework.

IV. COLLABORATIVE MULTI-AGENT JAMMING ALGORITHM

    4.1 Algorithm Description

The multi-agent MDP is well suited to being solved with reinforcement learning. Q-learning is a classical model-free reinforcement learning algorithm that works on a "decision-feedback-update" mechanism [35]. The Q-learning method stores all the Q-values corresponding to state-action pairs in a Q-value table. The agent makes a decision based on the current state and updates the Q-value table with the obtained reward values.
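As background, a single-agent tabular Q-learning step (a generic sketch, not the paper's code; the state/action encodings and the α, γ values are arbitrary examples) looks as follows:

    from collections import defaultdict

    Q = defaultdict(float)   # Q-value table: (state, action) -> Q-value
    alpha, gamma = 0.1, 0.8  # learning rate and discount factor (example values)

    def q_update(s, a, r, s_next, actions):
        # One "decision-feedback-update" step of classical Q-learning.
        best_next = max(Q[(s_next, a2)] for a2 in actions)
        Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * (r + gamma * best_next)

    q_update(s=0, a="f1", r=1, s_next=1, actions=["f1", "f2", "f3"])
    print(Q[(0, "f1")])  # 0.1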

Motivated by [36], we propose a collaborative jamming algorithm based on multi-agent reinforcement learning for the multi-agent MDP model. As illustrated in Figure 4, each jammer maintains an independent Q-value table and the central server maintains a collaborative Q-value table. Each jammer updates its Q-value table based on the state it senses and the reward it gets, whereas the central server receives all the independent Q-value information to update the collaborative Q-value table and make collaborative decisions. Therefore, the process of updating the Q-value table realizes the function of "distributed calculation and collaborative decision". The jammer j_m updates its Q-value table according to the following equation:

$$Q_m(s_t,\mathbf{a}_t)\leftarrow(1-\alpha)\,Q_m(s_t,\mathbf{a}_t)+\alpha\left[r_m(s_t,a_m)+\gamma\, Q_m(s_{t+1},\mathbf{a}^{*})\right],\tag{11}$$

where α denotes the learning rate and γ denotes the discount factor. s_{t+1} denotes the next state after the execution of the collaborative action a_t in the state s_t, and r_m(s_t, a_m) denotes the immediate reward obtained by the jammer j_m after all the jammers take the collaborative action a_t in the state s_t. a* denotes the collaborative action in the state s_{t+1} that brings all the jammers the maximum gain value, which is given by the following equation:

$$\mathbf{a}^{*}=\arg\max_{\mathbf{a}'}\sum_{m=1}^{M}Q_m(s_{t+1},\mathbf{a}'),\tag{12}$$

In this case, the multi-agent Q-learning update in Eq. (11) is computed in a distributed manner: each jammer updates its Q-values individually, and together the jammers maintain the same collaborative Q-value table. However, for Eq. (12), a global coordination policy with common rewards needs to be solved:

$$Q(s_t,\mathbf{a})=\sum_{m=1}^{M}Q_m(s_t,\mathbf{a}),\tag{13}$$

where Q_m(s_t, a) denotes the Q-value of the jammer j_m, stored in its independent Q-value table, and Q(s_t, a) denotes the collaborative Q-value table that is maintained and updated by all jammers. Therefore, updating Q(s_t, a) can be transformed into updating the Q_m(s_t, a) of each jammer. According to Eq. (12), the jammers obtain the optimal collaborative policy when all the Q-values Q_m(s_t, a) in the collaborative Q-table converge to the optimal values.
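A compact sketch of this "distributed calculation and collaborative decision" loop, under the update rules of Eqs. (11)-(13), might look as follows; the integer state/joint-action encodings and the table sizes are hypothetical (they happen to match the M = N = 2, K = 10 setting of Section V).

    import numpy as np

    M_JAM = 2                    # number of jammers
    N_STATES, N_JOINT = 90, 100  # example sizes of |S| and |A|
    alpha, gamma = 0.1, 0.8

    # One independent Q-table per jammer; the collaborative table is their sum.
    Q_ind = [np.zeros((N_STATES, N_JOINT)) for _ in range(M_JAM)]

    def collaborative_action(s):
        # Eqs. (12)-(13): the joint action maximizing the summed Q-values.
        Q_collab = sum(Q_ind)  # Eq. (13), computed on demand
        return int(np.argmax(Q_collab[s]))

    def cmja_update(s, a, rewards, s_next):
        # Eq. (11): each jammer updates its own table toward the value of
        # the best collaborative action a* in the next state.
        a_star = collaborative_action(s_next)
        for m in range(M_JAM):
            Q_ind[m][s, a] = ((1 - alpha) * Q_ind[m][s, a]
                              + alpha * (rewards[m] + gamma * Q_ind[m][s_next, a_star]))

    cmja_update(s=0, a=5, rewards=[1, 0], s_next=1)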

To prevent the reinforcement learning algorithm from falling into the "exploration-exploitation" dilemma, we use the ε-greedy strategy to balance exploration and exploitation. The jammers randomly select a joint action a ∈ A with probability ε, and select the joint action a* of Eq. (12) with probability 1 − ε. To achieve a smooth transition of the decision-making from exploration to exploitation, we design ε as follows:

$$\varepsilon=\varepsilon_0\, e^{-\lambda t},\tag{14}$$

where ε_0 denotes the initial exploration rate, λ denotes the rate parameter and t denotes the iteration time. As the number of iterations increases, the value of ε gradually approaches 0 and the jammers increasingly tend to select the joint action a* given by Eq. (12).
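For instance, assuming the exponential schedule of Eq. (14), the exploration probability can be computed as below; ε_0 = 0.9 and λ = 0.01 are arbitrary example constants.

    import math

    def epsilon(t, eps0=0.9, lam=0.01):
        # Exploration probability decaying from eps0 toward 0 over iterations.
        return eps0 * math.exp(-lam * t)

    print(epsilon(0), epsilon(100), epsilon(1000))  # 0.9, ~0.33, ~4e-5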

Based on the above analysis, the collaborative multi-agent jamming algorithm (CMJA) is proposed and its steps are given in Algorithm 1.

    4.2 Convergence Analysis

As discussed above, the optimal policy under each state is given by the collaborative Q-table expressed by Eq. (13). The Q_m(s_t, a) of each jammer is calculated independently and the collaborative Q(s_t, a) is their sum. The convergence of the collaborative Q-learning algorithm can thus be guaranteed by the convergence condition of single-agent Q-learning. Referring to [36–38], the convergence condition of collaborative Q-learning is given in Theorem 1.

Algorithm 1. Collaborative multi-agent jamming algorithm (CMJA).
Initialization: S, Q(s_t, a), Q_m(s_t, a);
1: For t = 0, ..., T do
2:   Each jammer observes its current state s_t = {u_1(t), ..., u_N(t)} and selects a channel according to the following rules:
     · The jammer j_m randomly chooses a channel profile a ∈ A with probability ε.
     · The jammer j_m chooses the channel profile a* ∈ arg max_{a'} Σ_{m=1}^{M} Q_m(s', a') with probability 1 − ε.
3:   Each jammer calculates its reward r_m(s_t, a_m).
4:   The state is transformed into s_{t+1} = {u_1(t+1), ..., u_N(t+1)}. Eq. (11) and Eq. (13) are used for updating the values of Q_m and Q, respectively, in the state s_t.
5: End for

Theorem 1. Given bounded rewards r_m and a learning rate α ∈ (0,1), if α satisfies

$$\sum_{t=0}^{\infty}\alpha_t=\infty \quad\text{and}\quad \sum_{t=0}^{\infty}\alpha_t^{2}<\infty,\tag{15}$$

the agents will converge to the optimal policy as ε → 0.

Therefore, the convergence of the proposed algorithm can be guaranteed as long as the learning rate α is set to meet the above conditions.
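For example, assuming Eq. (15) is the standard Robbins–Monro condition, a decaying schedule such as α_t = 1/(t+1) satisfies it: the harmonic series Σ 1/(t+1) diverges, while Σ 1/(t+1)² converges (to π²/6). A constant learning rate satisfies the first sum but not the second, which is why a decaying α is needed for exact convergence.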

During the iteration process, the Q-value table is constantly updated until it no longer changes or the changes become very small. This indicates that the Q-values have converged to the optimum, and the policy derived from the collaborative Q-table at that point is the optimal policy.

    4.3 Algorithm Complexity Analysis

Motivated by [39], we analyze the complexity of the proposed algorithm. The algorithm complexity can be expressed as O(F), which is determined by the code that is executed the most times in the algorithm. The proposed algorithm consists of three main stages: making a collaborative decision, calculating the reward value, and updating the Q-value table. We assume that the number of jammers is M, the number of user pairs is N, and the number of channels is K. The complexity of the three stages in one time slot is analyzed as follows.

The jammers take a collaborative decision action according to Eq. (12), and the complexity can be expressed as C_d = O(A_K^N × K^M), where A_K^N and K^M denote the number of rows and columns of the collaborative Q-value table, respectively.

The complexity of the reward value calculation is based on the reward value designed in Eq. (7) and can be expressed as C_r = M × O(N + M). N indicates the number of checks the jammer j_m needs to determine whether there currently exists a communication channel equal to the action a_m it has taken, and M denotes the number of checks required to account for the actions of the remaining jammers in the immediate reward.

The Q-value table updating process requires updating the Q-value tables of the M jammers independently, and the complexity can be expressed as C_u = M × O(A_K^N × K^M), where A_K^N and K^M denote the number of rows and columns of an independent Q-value table, respectively.

We assume that the number of iteration time slots is T_num. Therefore, the complexity of the proposed CMJA algorithm can be expressed by the following equation:

$$C=T_{num}\times\left(C_d+C_r+C_u\right)=O\!\left(T_{num}\times M\times A_K^N\times K^M\right).\tag{16}$$

    It can be concluded that the complexity of the proposed algorithm increases exponentially with the number of jammers and channels,thus making it suitable for use in small-scale scenarios.
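To give a feel for these sizes, here is a quick computation under the simulation settings of Section V (M = 2 jammers, N = 2 users, K = 10 channels), reading the row count as the number of ordered arrangements of the N user channels among the K channels:

    from math import perm

    M_JAM, N_USERS, K = 2, 2, 10
    rows = perm(K, N_USERS)  # A_K^N = K!/(K-N)! = 90 possible states
    cols = K ** M_JAM        # K^M = 100 joint jamming actions
    print(rows, cols, rows * cols)  # 90 100 9000 Q-values per table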

V. SIMULATION AND ANALYSIS RESULTS

This section presents the simulation results. Consider a scenario with two jammers and two user pairs, i.e., M = N = 2. The users have 10 available channels, i.e., K = 10. Table 1 gives the main parameters of the CMJA algorithm. The initial values of the parameters are chosen based on empirical values, and further tuning of the parameters is performed during the simulation. In particular, a compromise value of the discount factor γ is chosen to balance present and future rewards, whereas a smaller value of the learning rate α is chosen to ensure a balanced weighting of the reward values. The other main parameters are shown in Table 1.

Table 1. Parameter values used in the simulation.

To verify the effectiveness of the proposed CMJA algorithm, we compare it with the independent Q-learning algorithm [22], in which each jammer executes the classical independent Q-learning method without considering coordination among the jammers. In the simulation, we evaluate the performance of the proposed CMJA algorithm with two indexes: the jamming success rate and the normalized throughput of the users. In addition, we assume that the jammers can detect and count the users' ACK messages. Therefore, we define the jamming success rate as:

$$P_{suc}=\frac{S_{suc,j}}{S_{tot,j}},\tag{17}$$

where S_suc,j denotes the number of packets successfully jammed and S_tot,j denotes the total packet count. The normalized throughput of the users is defined as:

$$T_{nor}=\frac{S_{cur,t}}{S_{no,j}},\tag{18}$$

where S_cur,t denotes the number of packets currently being successfully transmitted and S_no,j denotes the number of packets transmitted without jamming. To make the simulation results clear and intuitive, the jamming success rate and normalized throughput are averaged over every 20 jamming time slots. The results are obtained by averaging over 50 independent runs.
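A small sketch of how such windowed statistics can be computed from a per-slot outcome log (our illustration; jammed is a hypothetical boolean list recording whether each slot's packet was successfully jammed):

    def windowed_success_rate(jammed, window=20):
        # Average the per-slot outcome over consecutive windows of 20 slots,
        # as done for the plotted curves.
        return [sum(jammed[i:i + window]) / len(jammed[i:i + window])
                for i in range(0, len(jammed), window)]

    # With one packet per slot, normalized throughput is the complement of
    # the jamming success rate under the definitions of Eqs. (17)-(18).
    success = windowed_success_rate([True, False, True, True] * 10)
    throughput = [1 - r for r in success]
    print(success, throughput)  # [0.75, 0.75] [0.25, 0.25]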

Figure 5. Jamming success probability in mode I.

In the simulations, for comparison, we assume that the users have the following two channel switching modes:

Mode I: The users communicate using a fixed-sequence frequency hopping mode.

Mode II: The users communicate using probabilistic frequency hopping according to Eq. (1). The current channel is retained with a probability of 30% and the next channel is selected with a probability of 70%.

    5.1 The Simulation Results for Mode I

Figure 5 shows the jamming success probability curves of the jammers. The simulation results show that at the beginning of the algorithm, the success rates of the proposed CMJA algorithm and the independent Q-learning algorithm are both low and approximately the same. As time goes by and the jammers continue to learn until the policy table converges, the jamming success rate of the CMJA algorithm reaches 100%, whereas the independent Q-learning algorithm achieves only about 50%.

Figure 6. The change in the users' throughput in mode I.

Figure 7. Jamming success probability in mode II.

Figure 6 shows a comparison of the normalized throughput under the CMJA algorithm and the independent Q-learning algorithm. The throughput under the independent Q-learning jamming algorithm gradually decreases over time, eventually stabilizing around 30%. This is because there is no cooperative association among the jammers, each of which selects channels independently. Two jammers can take the same action in one time slot, and thus some users can communicate normally. The CMJA algorithm considers the coordination between the jammers and makes optimal decisions that successfully jam two user channels simultaneously. Thus, the jammers gradually find the optimal jamming policy and the throughput gradually decreases and eventually converges, fluctuating around 5%.

    5.2 The Simulation Results for Mode II

Figure 8. The change in the users' throughput in mode II.

The jamming success probability curves for the users communicating in mode II are shown in Figure 7. In this case, the users probabilistically select the communication channel instead of using a fixed channel switching policy. It can be seen that the CMJA algorithm can learn the frequency usage policy of the users and jam the communication channels with a certain probability. In contrast, when the jammers adopt the independent Q-learning algorithm, the success rate of jamming is low due to the uncertainty of the users' channel switching and the independence between the jammers. With a user channel transition probability of 70%, the jammers executing the proposed CMJA algorithm can successfully jam the user data with a probability of about 70%.

Figure 8 shows the variation of the normalized throughput under the CMJA algorithm and the independent Q-learning algorithm. The normalized throughput remains at a high level when the jammers adopt the independent Q-learning algorithm because of its low jamming success rate. Statistically, around 60% of the data is transmitted properly and 40% of the user data is successfully jammed. When the jammers execute the proposed CMJA algorithm, the users' normalized throughput fluctuates around 35% during the convergence phase and around 65% of the user data is successfully jammed. Compared to the independent Q-learning algorithm, the normalized throughput of the users drops by approximately 25%.

The reason for the large fluctuations of the curves in Figure 7 and Figure 8 is the uncertainty of the users' channel switching. When averaging over every 20 time slots, the number of slots in which the users choose to remain on their channels is random. In addition, when the users choose to remain on the current channel in the next time slot, the jammers tend to select the next channel, which has a larger Q-value; this causes a decision error at that point, and thus the curves exhibit some fluctuation.

Based on the analysis described above, the proposed CMJA algorithm exhibits superior performance compared to the independent Q-learning algorithm. The latter does not consider the coordination between jammers, and each jammer selects its channel independently; thus, different jammers can make the same decision, which results in a waste of spectrum resources. In the proposed CMJA algorithm, not only are the actions of the users learned, but the coordination between the jammers is also considered. Thus, the jamming effect is better than that achieved by the independent Q-learning method.

VI. SDR-BASED TESTS FOR THE PROPOSED INTELLIGENT JAMMING SYSTEM

This section describes a multi-agent jamming system built on a software radio platform; the overall system is developed in C++ on a Linux system. The system uses NI USRP 2920 and B210 devices as the hardware platform. The composition of the system is shown in Figure 9. In terms of functional composition, the system contains three subsystems: an intelligent jamming subsystem, a wireless transmission and communication subsystem, and a confrontation visualization subsystem. The intelligent jamming subsystem contains two submodules, namely, a spectrum-sensing submodule and an intelligent decision-making submodule, which coordinate to drive the implementation of the proposed CMJA algorithm.

The wireless communication subsystem serves as the companion system for verifying the algorithm, completing the transmission and reception of user data. It consists of four USRP B210 devices and a PC terminal, which are connected via a switch and gigabit network ports. The communication frequency parameters and the channel switching policy can be set from the PC terminal.

The confrontation visualization subsystem consists of a PC terminal, which provides interface operation and display using the developed user terminal program. The subsystem can display the received spectrum waveform and analyze, in real time, the amount of data and the number of ACKs normally transmitted by the communication.

Figure 9. The composition of the system.

For the platform system presented in this section, our main contributions are the spectrum-sensing process and the intelligent decision-making process.

    6.1 Design of the Intelligent Jamming System

The spectrum-sensing subsystem contains a USRP 2920 device and a PC terminal, which are connected via gigabit Ethernet. The USRP device is driven by the USRP universal hardware driver (UHD) to receive the users' signals, and the digital signals are processed in the PC terminal. The spectrum-sensing subsystem can obtain the spectrum state in real time by using a wideband fast spectrum sensing technology [40].

The intelligent decision-making subsystem consists of two USRP 2920 devices and a PC terminal, and it selects the jamming channels based on the proposed CMJA algorithm. The subsystem uses the user frequency data sent by the spectrum-sensing module as the current state s_t = (u_1(t), u_2(t)), and determines the collaborative jamming channels a = (a_1, a_2) by searching the collaborative Q-table. The subsystem works by online learning, during which the Q-table of each jammer and the collaborative Q-table are updated.

    The steps for jammers to release the jamming signals are as follows:

Step 1: The independent Q-tables, the collaborative Q-table, and the reward matrix are initialized.

Step 2: Interaction with the spectrum-sensing subsystem is carried out to obtain the users' current communication frequencies.

Step 3: Based on the current users' channels, i.e., the current state, the jammers decide on the communication channels to be jammed in the next time slot on the basis of the collaborative Q-value table.

Step 4: The relevant jamming parameters are configured based on the sensed user information, such as the jamming duration, the jamming frequency interval, the transmission power of the jamming signals, and the jamming gain.

Step 5: The USRP devices are driven, through the UHD driver configuration, to send the jamming signals for accurate jamming.

Step 6: Steps 2 to 5 are repeated to realize intelligent decision-making (a schematic control loop is sketched below).
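As an illustration only, Steps 2-6 form the control loop sketched below; sense_spectrum, select_jamming_channels and configure_and_transmit are hypothetical stand-ins for the subsystem interfaces described above (the real system drives the USRP hardware through UHD).

    import random

    def sense_spectrum():
        # Stand-in for Step 2: report the users' current channels (u1, u2).
        return (random.randrange(10), random.randrange(10))

    def select_jamming_channels(state):
        # Stand-in for Step 3: collaborative Q-table lookup.
        return state

    def configure_and_transmit(channels):
        # Stand-in for Steps 4-5: parameter configuration and transmission.
        print("jamming channels:", channels)

    def jamming_loop(num_slots):
        # Steps 2-6: sense, decide, configure, transmit, repeat.
        for _ in range(num_slots):
            state = sense_spectrum()
            channels = select_jamming_channels(state)
            configure_and_transmit(channels)

    jamming_loop(3)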

    6.2 Testing and Verification of the Multi-Agent Jamming System

Based on the multi-agent jamming system described above, the demonstration and verification system built in this work is shown in Figure 10.

This section presents the procedure and the results of the tests conducted to verify the effectiveness of the proposed algorithm. The actual jamming effect is tested using the communication system built as a companion. To accurately evaluate the jamming performance of the algorithm, we define the normalized throughput of the users as the evaluation index of the jamming effect:

$$T=\frac{P_{ack}}{P_{all}},\tag{19}$$

Figure 10. The demonstration and verification system.

Figure 11. The initial spectrum.

where P_ack denotes the number of packets correctly received at a certain time during the actual transmission and P_all denotes all the packets transmitted during the actual transmission.

In the test, two scenarios are considered for verifying the intelligence and effectiveness of the built multi-agent jamming system by combining the two communication modes mentioned above. The actual communication frequency range of the users is 834–852 MHz, with a frequency interval of 2 MHz, for a total of 10 communication channels. The length of each communication time slot is 1 s.

The display of the confrontation visualization subsystem is shown in Figure 11, which shows the spectrum waveform at the beginning of jamming. The orange and yellow boxes mark the spectrum waveforms of the communication signal and the jamming signal, respectively, whereas the pink box below shows the users' normalized throughput, which is used for evaluating the jamming effect. The transmission success rate reaches 100% before the initial jamming signal is sent and drops once jamming begins. The details of the system tests under the two scenarios are given below.

    Scenario I:Users use fixed sequence frequency hopping(mode I)to communicate.

To make the test results intuitive and clear, we calculate the normalized throughput after every 5 communication time slots, and the results thus obtained are shown in Figure 12(a). To quantitatively analyze the jamming effect of the proposed algorithm, we record and plot the normalized throughput in Figure 12(b).

Figure 12. The test results in scenario I.

Figure 13. The test results in scenario II.

At the beginning of the algorithm execution, the jammers have not yet learned the users' communication pattern and release the jamming signals blindly and irregularly. Consequently, the jamming effect is not obvious and the normalized throughput remains at a high level. The normalized throughput of the users gradually decreases as the jammers continue learning and iterating, and finally stabilizes at a low level as the algorithm converges. From Figure 12(b), it can be seen that the users' normalized throughput is reduced to approximately 8%: the jamming spectrum precisely suppresses the communication spectrum at this point. The throughput curve in the figure illustrates the entire online learning process of the jamming algorithm. The actual platform test results are consistent with the simulation results, which proves the effectiveness of the proposed CMJA algorithm.

Scenario II: Users use probabilistic frequency hopping (mode II) to communicate. The users remain on the current channel with a probability of 30% and select the next channel with a probability of 70%.

We calculate the normalized throughput after every 10 communication time slots. The results thus obtained are shown in Figure 13(a), and the recorded data are plotted in Figure 13(b).

The proposed CMJA algorithm cannot suppress the user channels with a probability of 100% when the communication users use the probabilistic frequency hopping mode. This is due to the random nature of the users' channel changes. Based on the current state, the jammers select the actions with the largest Q-value in the collaborative Q-table during the convergence phase of the algorithm. When the users choose to remain on the current channel, the jammers select the next channel, which has a larger Q-value, resulting in a decision error.

As shown in Figure 13(b), the jammers can successfully jam the packets of the users with a probability of 55%. Unlike the simulation, where the assumptions are ideal, the actual wireless communication environment exhibits multipath effects and transmission delays. This results in a lower jamming success rate in the actual communication test compared to the simulation results.

    6.3 Applications and Perspectives

Traditional communication jamming techniques and equipment have significant capability shortcomings when countering networked systems. A distributed cluster system has strong jamming capability in the time, frequency and spatial domains, and cognitive electronic countermeasures can adopt the collaborative jamming technology of multiple jammers, which can be optimized in the time, frequency and spatial domains simultaneously [41, 42]. The distributed cluster system can integrate multiple distributed jamming subsystems to realize the sharing of spectrum resources and complete collaborative jamming against the target network system. Therefore, the collaborative jamming algorithm proposed in this paper has good application prospects in distributed cluster systems and is also one of the effective means of jamming networked communication systems.

In addition, software-defined radio is highly flexible, allowing new functions to be added through software modules, and highly open, so its hardware can be updated or extended as devices and technology develop. SDR technology is currently developing in the direction of miniaturization, integration and intelligence, and has good prospects in the field of communication countermeasures. For example, the SDR architecture can be deployed in UAV electronic jamming systems, which require multiple jammers to work together in a coordinated manner to complete the jamming task. Therefore, the proposed multi-agent jamming scheme can be applied to UAV group collaborative jamming in the future.

VII. CONCLUSION

In this paper, we investigated the problem of channel selection for multiple intelligent jammers in a multi-user scenario. Firstly, we introduced the multi-agent MDP framework to model and analyze the multi-agent jamming problem. Secondly, a collaborative jamming algorithm based on multi-agent reinforcement learning was proposed, and simulation results showed that the CMJA algorithm outperforms the independent Q-learning algorithm. Finally, the effectiveness of the proposed CMJA algorithm was verified on the SDR platform. Results from the verification tests showed that the proposed CMJA algorithm can effectively jam multi-user communications through "distributed calculation and collaborative decision".

It should be pointed out that the proposed CMJA algorithm is based on the Q-learning algorithm, which belongs to the table-search-based reinforcement learning methods and is unable to solve high-dimensional decision problems in large-scale scenarios. Recently, the mean-field learning method has been widely studied, which may be a feasible way to address the shortcomings of Q-learning in large-scale scenarios. In future work, we will consider modeling the multi-agent decision-making process as a Markov game and solving it using a mean-field multi-agent reinforcement learning algorithm, aiming to realize fast decision-making in large-scale communication confrontation scenarios.

    ACKNOWLEDGEMENT

This work was supported by the National Natural Science Foundation of China (No. 62071488 and No. 62061013).
