• <tr id="yyy80"></tr>
  • <sup id="yyy80"></sup>
  • <tfoot id="yyy80"><noscript id="yyy80"></noscript></tfoot>
  • 99热精品在线国产_美女午夜性视频免费_国产精品国产高清国产av_av欧美777_自拍偷自拍亚洲精品老妇_亚洲熟女精品中文字幕_www日本黄色视频网_国产精品野战在线观看 ?

    Energy Estimation and Optimization Platform for 4G and the Future Base Station System Early-Stage Design

    2017-05-08 13:18:51WeiWangDakeLiuYingZhangChenGong
    China Communications 2017年4期

    Wei Wang, Dake Liu, Ying Zhang, Chen Gong

    Lab of Application Specific Instruction-set Processors, Beijing Institute of Technology, Beijing 100081, Beijing, China

    * The corresponding author,email: dake@bit.edu.cn

    I. INTRODUCTION

    Energy savings have been a driver for standardization of cellular networks like LTE-Advance [1] and the 5G mobile communication network [2]. Base station energy/power consumption modeling [3-9] at system level is an essential research. However, most of existing power/energy models have not yet provided instrumental energy/power estimation for system designers in early design stages. This is because existing models greatly depend on detailed physical information of commercial-offthe-shelf (COTS) components. Such COTS information may be unavailable or insufficient in early design stages.

    On the one hand, in [5,8,9], GIPS-based(giga instructions per second) method is proposed for power estimation of digital baseband subsystem (DBB). The detailed physical information here refers to the power to performance ratio of an available processor and the number of instructions involved in each physical layer algorithm. This physical information is used for power estimation of each algorithm. However, if there is no available processor for new system design in early stages, GIPS-based method will not be feasible due to the lack of the physical information. The as-yet unstan-dardized 5G is such an example.

    In this paper, a holistic power/energy estimation model is proposed for base station system early-stage design.

    On the other hand, in [3-6], measured power values of all analog subsystems except PA,are simply used for power estimation. However, base station design is usually highly customizable. The measured values may not be suitable in other cases. Generic analog component power models are nevertheless necessary in system design for power budget, to select suitable components and to identify low-power design opportunities in early design stages.

    Therefore, this paper motivates the needs for modeling energy and power consumption of new generation base station designs with insufficient physical information. A cross-platform DBB energy model is proposed for energy estimation of new system design, which is strongly inspired by block reuse [10]. With the physical information from reuse blocks,energy consumption of other new design blocks in target system can be estimated. Our fine-grain energy model focuses not only on energy estimation, but also on evaluating and optimizing energy/power consumption of candidate system designs in early design stages.Consequently, an energy optimization method based on the DBB energy model is introduced to explore optimal DBB design alternatives for energy/power minimization. Meanwhile,through collecting power figures from hundreds of published papers, statistical power models of analog subsystems are also presented to capture the variation trend of existing radio frequency (RF), analog-to-digital converter (ADC), and digital-to- analog converter(DAC) designs. In addition, a PA power model is also proposed, which provides more configurable parameters for designers to estimate power consumption of candidate PA designs.

    The rest of this paper is organized as follows. In section II, modeling methodology and basics of digital baseband are reviewed and discussed. The generic baseband energy model and analog subsystem power models are presented in section III. A power/energy optimization method combined with a design space construction method for digital baseband design is introduced in section IV. Section V provides an early power estimation case of digital baseband and power amplifier. Section VI concludes this paper. For easy reading, the main abbreviations in this paper are summarized and listed at the end of the paper.

    II. MODELING METHODOLOGY

    In this section, the DBB energy modeling method and related backgrounds are briefly reviewed.

    2.1 Principle and scope

    We proposed a hierarchical power model for estimating the IC (integrated circuit) power dissipation of MIMO- OFDM baseband in early design stages [11]. By this model, the power estimation of an algorithm is mainly depended on computing costs (e.g., the number of adders, multipliers, logic gates and memories)mapping on a specific microarchitecture. This power model is suitable for both programmable processors and dedicated hardware.

    In this paper, besides the power model, an additional execution time model is also proposed together with the power model as an integrated DBB energy model.

    2.2 Modeling basics

    According to ITRS report [12], the proportion of design block reuse in new design is 35% in 2007, and it will increase to 58% by 2022. More pre-existing blocks (intellectual property-IP) are applying to new designs to reduce design costs and time-to-market. System level DBB energy model in this paper is thus strongly inspired by block reuse. “Many designs (especially microprocessors) are based on modifications or extensions to existing designs, such a method can provide fairly accurate early estimates” [13]. In other words,though it is a new DBB design, it mostly consists of pre-existing blocks. Thus limited yet predictable physical information from reuse blocks can be used to estimate power consumption and execution time of new design blocks. In this paper, a pre-existing or new design block refers to a physical layer module(e.g., FFT module).

    Consequently, these two kinds of blocks are taken into consideration in our DBB model,i.e.,

    --Pre-existing blocks (reuse blocks): This paper focuses on heterogeneous multiprocessor system-on- chip (MPSoC) DBB platform,as shown in Figure 1 [14]. Pre-existing blocks here refer to existing microarchitectures of specified algorithms running on digital signal processor (DSP), application specific integrated circuit (ASIC) or field -programmable gate array (FPGA) platforms. Furthermore, implementation details (i.e., physical information) at low-level (register transfer level–RTL and below, e.g., gate-level) can be reused in early design stages, such as data parallel, task parallel,counts of logic gates/macro blocks, memory cost and data access, interconnect information(e.g. the number of metal layers, area, length,power and delay), and data switching activity;

    --New design blocks: New design blocks can be divided into two categories: modification of an existing microarchitecture (e.g.,parallelization) and new microarchitecture design. For the latter, we model an early microarchitecture design by using its control and data flow graph (CDFG, it is a directed acyclic graph, the vertices of CDFG represent either arithmetic or logic statements of algorithms, or the control statements. Edges model the data/control flow dependencies.), timing diagram,and target platform (DSP/ASIC/FPGA). Due to the lack of implementation details at low level, energy estimation of new design blocks mainly depends on physical information derived fromreuse blocks.

    2.3 Modeling objects and methods

    2.3.1 Objects

    The following offers details when building our DBB energy model for aforementioned two kinds of blocks.

    For every specific physical layer module (e.g., channel estimation module in LTE PHY),

    --Algorithm set: Classic and new algorithms are collected for a module and further described as their CDFGs and timing diagrams. For instance, FIR algorithm set includes two basic members at least, i.e., direct form and linear phase algorithms. This method is exposed to designers to build their own algorithm sets.

    For each algorithm within an algorithm set,

    --Microarchitecture set: According to CDFG and timing diagram of an algorithm, its microarchitecture is selected and/or designed.Data parallel and task parallel structures of a CDFG are explored for designing microarchitecture set.

    For power estimation and execution time estimation of each microarchitecture within a microarchitecture set,

    --Standard cell/Macro block library: For DSP and ASIC platforms, their smallest unit is standard cell and macro block at RTL. The standard cell is a kind of predesigned sub-circuits such as simple logic gates. The macro block is also a kind of predesigned sub-circuits with regular structure (e.g., multipliers, adders and memories) composed from standard cells.The standard cell/macro block library contains all of the available standard cells and macro blocks with detailed physical information. In terms of the proposed DBB energy model, in order to accurately estimate execution time and power consumption of new design blocks,standard cell/macro block library used in reuse blocks has to be referred. On the other hand,the smallest units in a FPGA are programmable logic elements, not standard cells. A logic element usually consists of lookup tables and registers. FPGA functional modules are composed of such logic elements. Hence, for FPGA-based baseband energy estimation, the logic elements are used in the proposed model.

    Fig. 1 Heterogeneous digital baseband MPSoC

    2.3.2 Methods

    Energy estimation for a new design block can be performed in 3 steps:

    Step 1–Microarchitecture: A hardware block is built which is based on an algorithm CDFG of a new design block.

    Step 2–Hardware: The logic gates, functional macro blocks (e.g., multiplier), and memory access within microarchitecture are counted.

    Step 3–Estimation: According to specified physical information extracted from reuse blocks, execution time and power consumption of this microarchitecture can be estimated using the following models.

    III. GENERIC MODELS

    In [11], the power model of a base station has been formulized as follows:

    In this section, integrated DBB energy model and analog subsystem power models,i.e.,EDBB,PRF,PPA,PADC,PDAC, are presented.The power consumption of system level memory buffer (PBuffer), the power supply (PMain Sup-ply), and the backhaul (PBackhaul) largely depend on specific equipment specification.

    3.1 Digital baseband energy models

    Circuit energy consumption is the product of its average power and execution time (runtime). The power model is thus first introduced, followed with execution time model and the final energy model.

    1) The power model

    We restructure the power model of [11] in the view of MPSoC illustrated in Figure 1.Note that the power model still focuses on dynamic power.

    In (2), λcontrolis the overhead ratio of the whole control power including microcontroller(MCU) and part of microarchitecture (e.g.,instruction fetch, instruction decoding, address computing of DSP; finite state machine of ASIC/FPGA).

    Pmodule,mis the power dissipation of the physical layer modulem, such as the power dissipation of a sequence of instructions (DSP)or an application-specific integrated circuit(ASIC/FPGA).Mis the total number of PHY modules at base station side. Each module, executed by processing elements (PEs) in Figure 1, can be further modeled in (3).

    Ntask_parallel,is the number of enabled PEs within a cluster running the modulem.

    f,αmdenote the operating frequency and average data switching activity of all enabled PEs running the modulem, respectively.

    Ndata_parallel,is the degree of data parallelism of a PE, andNunit_typesis the number of different types of basic functional units in a PE.Noper-ands,kis the number of operands ofkth (type)basic functional unit in a PE. For instance,suppose the microarchitecture of FFT symbol processor (Figure 1) is with 4-complex data input (Ndata_parallel=4) and consists of two kinds of basic functional units, i.e., radix-2 butterfly and radix-4 butterfly (Nunit_types=2). 1) In terms of radix-2 (Noperands,1=2), there areNdata_parallel/Noperands,1=4/2=2 radix-2 butterflies in this FFT PE. 2) Likewise, there isNdata_parallel/Noper-ands,2=4/4=1 additional radix-4 butterfly in this PE.

    Uis the number of distinct types of logic gates/macro blocks (in terms of data precision,etc.) withinkth basic functional unit;Pk,i, Nk,iare the calibrated power and the number of theith (i.e., same type) logic gate/macro block(macro blocks including adder, multiplier,etc.).

    Vdenotes the number of memory access times whenkth basic functional unit runs once,Pmemory_read,k,j, Pmemory_write,k,jarethe power dissipation of memory read and write operations respectively. Its power dissipation largely depends on the memory size, memory(bank) width, data type (bit, real data and complex data) and data precision of operands.Nread,k,j, Nwrite,k,jis the iterations ofjth read and write operations, for example, duringkth basic functional unit execution, 2 times memory access happens (V=2), in the first round (j=1),100 bits need to read from local RAM into register file, yet the width of this RAM is 10 bit, as a result this job has to consume 10 read operations, i.e.,Nread,k,1=10,Pmemory_read,k,1is the power consumption of reading 10 bit.

    Pinterconnect,ndenotes the interconnect power dissipation in a single chip;Nchipsis the total number of chips. Referring to ITRS report[15], the power of interconnect wires in the chipncan be estimated by (4), where, αwireis the average interconnection wire activity ratio;ewis the wiring efficiency;Cwireis the average capacitance per unit length (including Metal 1, intermedia wires and global wires);Lis the average metal length of Metal 1 and five intermedia levels per unit area;Vddis supply voltage;fis the chip operating frequency;Ais the area of the chipn.

    Ppin&global_buffer,nin (5) denotes the total power consumption of pins (including full swing and LVDS pins) and non-functional on-chip global buffer (SRAM) of the chipn. This buffer is used to buffer data from other chips or off-chip memory. Its power (Pglobal_buffer,n) is a custom parameter for target system. The pin driver power consumption (ΣPpin(pair)) of a chip depends on the number of pin/pin-pair and power consumption per pin/pin-pair (Ppin(pair)).Ppin(pair)can be estimated by (6), where,Vddis the driver supply voltage andVswingis the pin voltage swing;Ris the termination resistor;Cpinis the load capacitance;fI/Ois the I/O port frequency.

    2) The execution time model

    In (7),Ldownlinkdenotes the total execution time of all PHY modules in downlink (DL).This execution time mainly includes computation delay and memory access delay. It can be approximately estimated by formula (8).

    M1is the total number of PHY modules in DL.x,dtdenotes the workload and execution time of the PEtwithin a cluster running the modulem1;tis the number of enabled PEs in this cluster and satisfiest<= total number of PEs of this cluster. ThusDm1(x)is a set comprising all availabledt. Furthermore, the maximum value among this set is exactly the execution time of the modulem1. Generally,the maximum time depends on workload allocation on different PEs within a cluster.

    Likewise,Luplinkis the execution time of the whole PHY modules in uplink (UL). However, compared to DL, a difference of UL (9) is the coexistence of user-shared modules (m2in part I, the same asLdownlink) and user-specific modules (m3in part II). For instance, in LTE/LTE-A uplink, FFT is a user-shared module processing symbols of the whole transmission bandwidth and all users. Rather, IFFT is user-specific module since its workload is only for partial transmission bandwidth, i.e. a user’s bandwidth.

    M2,M3are the number of user-shared modules and user-specific modules, respectively. It is important to note thatx’in part II is possibly a collected workload of multi-users when the number of users is more than the number of available PEs in a cluster. For example,suppose there are two processors in cluster 2(Figure 1) and three users in uplink; PE 1 may process the data of user 1 and PE 2 may process the data of user 2 and 3, thend’1, d’2is the computing delay of PE 1 and PE 2 with the workloadx’1(user 1),x’2(combination of user 2, 3) respectively.

    Lmemory_access_transferis the average time cost for data access between two chips or between the off-chip main memory and on-chip PE memory transfers e.g., direct memory access(DMA), which is largely depended on specific MPSoC structure design, such as main memory or PE position, bus structure, etc.

    In practice, the execution time of a specific PE can be calculated by (10),Wworkloadis the allocated workload of the PEt;Ndata_parallel,is the degree of data parallelism of PEt;CLPE,tdenotes the clock cycles consumed by PEtforNdata_paralleldata processing;fis clock frequency. Let us take CRC as an example; the original microarchitecture of CRC is linear feedback shift register (LFSR). Since data parallel cannot be applied on LFSR,Ndata_paral-lel=1. In other words, it computes one input bit per clock cycle, i.e.,CLPE,t=1 cycle. Suppose current workload of the PEtis 6144 bit andf=100MHz, then its execution time can be roughly estimated by (10),

    The runtime is obviously too long. Improvements and runtime minimization of CRC are discussed at the step 2 in section 5.1.1 of this paper.

    3) The energy model

    where, part I is the energy dissipation of all clusters and the MCU;di,Pmodule,m,iare the execution time and power consumption of the PEiin clustern(Figure 1) running the PHY modulem; part II, III is the energy consumption of interconnect and non-functional memory access of all chips respectively.

    3.2 Analog component power models

    We propose statistical power models for RF,DAC and ADC [16] by investigating several hundreds of published papers (from 1994 to 2016). These models can provide a clear picture of current situation and future trend of RF, DAC, and ADC power consumption for designers.

    3.2.1 RF

    The RF power model in form is,

    where, δtechnologyis technology scaling factors(With the development of silicon technology,the value of δtechnologyis not a constant. Consequently, both old references and recent papers of RF design cases need to be collected to capture the range of δtechnologyand create the predictable statistical model);Vconstantis DC(direct current) bias voltage;Vddis supply voltage;Vswingis the RF voltage swing. The voltage used for measuring the values of all parameters except θiin the brackets is 1 volt.Pbasedenotes the base power value within 1 MHz bandwidth at 1 GHz, IIP3=10 dBm and sensitivity=-20 dBm (actually, designers can intentionally choose the baseline of bandwidth, frequency, IIP3 (third-order intercept point) and sensitivity);fcis carrier frequency;δbandwidth,δIIP3,δsensitivityare RF parameters of bandwidth, IIP3, and sensitivity; θi,i=0,1,2,3,are corresponding weight factors;gi(·),i=0,1,2,3, are the mapping functions fromfc,δbandwidth,δIIP3and δsensitivitytoPRF;PFSis the power of frequency synthesizer.

    We collected up to 600 published papers and selected more than 200 useful papers for θiregressions without normalization, as illustrated in Figure 2,3,4,5. Regression models can be further created by using the overall data,e.g.,p(1)RF(green line) in Figure 3, and partial data, e.g., bandwidth>100 MHz, is selected to generatep(1)RF(red line) to roughly predict RF power dissipation of wide bandwidth system.Finally, designers should make a decision on reference values offc, δbandwidth,δIIP3, andδsensi-tivityof a target RF design and thus early power estimation can be calculated by using (12).It is important to note that the estimation accuracy of statistical RF circuit power models largely depends on the number of data and the selection of fitting functions. Several kinds of fitting functions were selected in Figure 2,3,4,5 to show a case of building statistical models, and designers should test goodness of fit by using their own selected data to select appropriate fitting functions.

    3.2.2 PA

    In this paper, several impact factors on PA efficiency are investigated to provide more chances for designers to accurately estimate PA power consumption.

    The PA power model is

    Fig. 3 RF statistical bandwidth-energy model

    Fig. 4 Rx RF statistical IIP3-energy model

    Fig. 5 Rx RF statistical sensitivity-energy model

    Fig. 6 DAC statistical power model

    Fig. 7 ADC statistical power model

    where,Poutis the output power radiated at antenna; σfrequency=0.02is the parasitic capacitance loss factor of CMOS and III-V semiconductors FET circuits (fcGHz); σpassive=0.15denotes passive component loss factor;σPPA=0.25(CMOS),σPPA=0.15(III-V semiconductors)is the power factor for pre-power amplifier whenfcis much less thanfmax.σPPAlargely depends on technology such as CMOS, Si Bipolar, GaN, GaAs, LDMOS[17]. σstructureis the circuit structure factor on PA efficiency including back-off [18]. σstruc-tureexposes the theoretical highest efficiency of the circuit for PA, for example, pulse width modulation ( σstructure=0.7), Doherty (σstructure=0.3~0.45) and others ( σstructure=0.1~0.2, e.g.,class A, class B). σfeedis a feeder loss of macro cell, due to the fact that base station sites and the antennas are often situated at different physical locations. In particular, this loss arises from antenna mismatch and the feeder cables. Feeder loss for smaller base stations can be negligible. It is to note that default figures provided are empirical values.

    3.2.3 ADC/DAC

    ADC/DAC is often overlooked due to its relatively small contribution for base station power consumption. Its power, especially the power of DAC, would play more significant roles in the upcoming 5G, when massive MIMO technique [19] is widely adopted. Consequently, in this paper, ADC/DAC power is modeled.

    Similar to RF statistical model, we investigated the impact factors of DAC power dissipation (such as resolution, sampling rate,pin power, etc.) and collected available design cases, products from published papers and OEM (ADI). In Figure 6, distinguishable power distribution can be found (where fsnyq is Nyquist sampling rate, SNDR is signal to noise and distortion ratio); especially the distributed characteristic of ADI products is a useful guideline on product selection. On the other hand, 55mW DAC (I or Q, 12 bit precision, 500 MSPS, no decimation part and no pin cost) may be the best reachable entry point for 5G low power design. Likewise, in [16],the power distribution of ADC has been thoroughly analyzed, as illustrated in Figure 7. It can be seen that 100mW ADC (I or Q, 12 bit precision, 1000 MSPS and pin power) may be the best alternative for 5G.

    There are two ways for ADC/DAC power estimation by using Fig.6 and Fig.7. First,designers can directly estimate the power consumption of new design according to target fsnyq and SNDR provided in two figures. Second, designers need to select the overall data or partial data in two figures to create regression models, and then the power consumption of target design can be predicted.

    IV. OPTIMIZATION FOR MPSoC DESIGN

    In Figure 1, each cluster in DBB MPSoC serves for one or more PHY modules (function) and contains one or more processing elements (hardware). Minimizing the power consumption of DBB MPSoC, according to(2), is equivalent to the minimization of the power consumed by each module. Meanwhile,two major factors impact on the power consumption of each module by (3), i.e., 1) the number of PEs within a cluster (degree of task parallelism) and 2) microarchitecture (all parameters in (3) exceptNtask_paralleldepending on microarchitecture). On the other hand,DBB as a hard real-time system must satisfy a runtime deadline specified by communication standard. Hence, in this section, based on the proposed DBB energy model, a power/energy optimization method under timing constraint is presented. This method selects a suitable microarchitecture and degree of task parallelism for each PHY module in digital baseband to minimize the power/energy consumption of the whole DBB MPSoC. Note that our optimization method mainly focuses on PE design without considering bus structure. Furthermore, this method is especially applicable to the design that more PHY modules than one are implemented in one processor.

    Actually, selecting suitable microarchitecture for each PHY module, to minimize the whole baseband power dissipation, is a case of design space exploration (DSE) in system level design. However, when designing the PE microarchitecture for a module, there are two main differences between conventional DSE and energy/power optimization here.

    1) Optimization goal: The goal of conventional DSE is to minimize the power consumption of a single module. By contrast, our target is to minimize the power consumption of the whole digital baseband.

    2) Constraints: Timing constraint of conventional DSE is on module itself. In our optimization, the timing constraint is on the whole baseband (all modules).

    In other words, for a same module, the suitable PE microarchitecture in our optimization is probably not the optimal one in conventional DSE, even though it (optimal one) contributes to power minimization of the whole digital baseband. Consequently, the traditional optimization algorithms (e.g., the greedy algorithm family: force directed scheduling,list scheduling… and the heuristic algorithm family: ant colony optimization, simulated annealing…[20] (chapter13)) used to explore the optimal operation scheduling may not be able to meet our demand. This is because both the greedy algorithm family and heuristic method family are likely to produce a sub-optimal solution due to the greedy nature and the dependence on probability. As a result, in this paper, a design space construction method and an optimization method based on this design space are proposed to explore the power/energy minimization MPSoC design of digital baseband.

    4.1 Design space construction

    Compared to traditional design space construction, our design space contains all of the potential PE microarchitectures (i.e., varied degree of data/task parallelism and pipeline)for each module. As illustrated in Figure 8,all of the candidate microarchitectures of the overall PHY modules need to be mapped to design space.

    In fact, PE design space here is a weighted directed graphG = , whereV = {V11,V12…}is a set of vertices together with a set of edgesE = {E0,11, E0,12…}, e.g.,E0,11=represents a directed edge fromV0toV11, and a set of weightsW = {W0,11,W0,12…}e.g.,W0,11=(V0,V11)=(E0,11)represents the weight of the edgeE0,11and its value here is execution time ofV11.

    Definition 1. In vertex setV,Vi,j?Vdenotes the candidate PE(s)jwith specific microarchitecture for the physical layer modulei. All of vertices of the moduleiare divided into different subsets by two basic indices: algorithm selection (red circle in Figure 8) and platform selection (green circle). Note thatV0just represents starting point.

    A vertex in the PE design space may be a single PE or a group of PEs (a cluster, i.e.,task level parallel), as shown in the blue circle in Figure 8. Since only one vertex for each module can be selected in exploration process,multi-PEs within a cluster have to be gathered together as a single vertex. In particular, if there is no constraint on the number of PEs within a cluster, we can try to select different combinations of PEs. In addition, every vertex in the design space is already bound with an algorithm and a microarchitecture. If several PHY modules share a same cluster, e.g., Bit processor in Figure 1 support both CRC and Scrambling modules, then these modules should be separated as a sequence of a single module in terms of PE design exploration.Once the PE microarchitectures of these modules are determined using our optimization method, designers can further explore the maximum resource sharing of these microarchitectures for the shared cluster design.

    Fig. 8 Design space exploration for MPSoC design

    Definition 2. In directed graphG, there is always a directed edgeEij,(i+1)k?Ethat connectsVi,jandVi+1,k, and there is no edge betweenVi,jandVi,k. Meanwhile,Wij,(i+1)k?Wis the weight of edgeEij,(i+1)k, and its value is marked by end vertexVi+1,k(i.e., execution time).

    4.2 Optimization method

    Once PE microarchitecture design space is set up, graph algorithms can be selected to explore suitable option for each PHY module under timing constraints. A PE DSE for selecting suitable PE microarchitectures for all PHY modules is often a multistep decision-making process. In this paper, we thus choose shortest path algorithm (dynamic programming) as a default optimization method based on above design space.

    Now letdG(u),dG(v)represent the shortest path from starting pointV0to vertexVuandVv,respectively, anddG(0)=0. Suppose thatTmax,WLpeakbe respectively timing constraint and peak workload of the whole PHY. Then, the goal of our optimization can be defined to be

    VertexVuis a subset of vertices of the moduleu, and only one vertex in this subset can be selected indG(u). By (10), the execution time of each vertexVi,j?Vin design space can be estimated withWLpeak, and is denoted byWij,(i+1)k?W. For givenWij,(i+1)k?W, we can apply shortest-path algorithms (e.g., Dijkstra,Bellman-Ford) on the weighted directed graph to execute global search and calculate optimal solutions. For optimal solutions, there exist two cases. On the one hand, if there is no solution, then designers need to add new candidate vertices to graphGand recalculate solutions.On the other hand, there exist one or more optimal solutions, in Figure 8, suppose bothd(1)G(n)=(V0, V11, V21, V31, V42…Vn1),d(2)G(n)=(V0,V13, V26, V32,V43…Vn2)meet timing constraintTmaxandd(1)G(n)= d(2)G(n)ord(1)G(n)

    Then, designers need to select the final solution for minimum power/energy consumption (optimization) from the collection of optimal solutions. Based on the tradeoff between execution time and power consumption, the optimal solution of shortest path is the minimum execution time of all PHY modules,but power consumption is probably the max.Actually, the Pareto optimal path of our power optimization problem probably is the longest path which satisfies the timing constraintTmax,e.g., from the second case, the Pareto path isConsequently,designers have to calculate the power consumption of each vertex in pathd(1)G(n), d(2)G(n)by using (3) and compare their power values. Smaller one is the Pareto optimal path for power optimization of the whole DBB MPSoC. Meanwhile, microarchitecture and degree of task parallelism of each PHY module are determined by corresponding vertex in Pareto optimal path (e.g.,d(2)G(n)). Likewise,energy consumption for pathd(1)G(n), d(2)G(n)can be further calculated to identify the Pareto optimal solution for energy minimization.

    Finally, a Pareto optimal MPSoC DBB design with minimum energy/power can be identified. Meanwhile, the goal of microarchitecture selection and degree of task parallelism determination for each PHY module can be achieved as well.

    This optimization method has been applied to our digital baseband design introduced in section 5.1. Actually, towards different optimization goals, designers can select distinct constraints (timing, power, area, and throughput)as weightsW.

    V. A CASE STUDY

    DBB and PA are currently two major power greedy subsystems in typical base stations [3][21]. In this section, as an example, a reference design for digital baseband MPSoC under timing constraints is introduced to expose the details about DBB energy consumption.PA power consumption is also estimated and compared with measured results in [3].

    5.1 Digital Baseband

    5.1.1 Scenario assumptions and time allocation

    As a kind of hard real-time embedded system,digital baseband subsystem in base station must satisfy the time constraints specified by communication standard at maximum load.For instance, even though there are 6 supported bandwidths (1.4MHz~20MHz) available in LTE, DBB designers have to follow corner conditions, i.e., 20MHz. Consequently, extreme scenario for our reference design is listed in Table I.

    Under the scenario, time allocation for each PHY module in UL/DL [23][24] is conducted under UL/DL timing constraint, i.e., 1.5/1.5 ms. Time allocation in our reference design focuses on PHY and part of MAC. We took 1 ms/subframe as practical timing constraint for time allocation of major PHY modules in UL/DL. Extra 0.5 ms is used for data processing of PDCP (packet data convergence protocol),RLC (radio link control), MAC (medium access control) and other PHY modules in DL/UL, e.g., resource element mapping in DL,and burst detection in UL. Note that our reference design mainly focuses on PE design of MPSoC without considering bus structure(Figure 1).

    Time allocation process can be simply stated as follows (suppose the algorithm set and the microarchitecture set for each PHY mod-ule are already constituted; for simplicity, the assumption of block processing is made; clock frequency is 100 MHz; 16 bit×16bit multiplier need 1 clock cycle):

    Table I LTE scenario assumptions for DBB reference design (LTE FDD macro cell)

    Step 1– Initial allocation: Initially, time constraint of the whole UL or DL PHY (1000 μs) is equally divided into 8 or 9 parts according to the number of DL PHY modules or UL PHY modules. Therefore, each module has an identical time slot, i.e., 1000/8=125μs (DL) or 1000/9=111.11μs (UL).

    Step 2– Algorithm and microarchitecture selection: Algorithm and microarchitecture of each module are selected to satisfy the allocated time constraint under the workload for 20MHz bandwidth. When selecting microarchitecture and algorithm for each module, the execution time models proposed in 3.1 are utilized to estimate the execution time for each microarchitecture. For instance, suppose there are two alternatives in DL TBCRC microarchitecture set, i.e., LFSR and 128-way data parallel CRC architecture [25], then LFSR needs about 33456 cycles (334.56μs) and the other needs 262 cycles (2.62μs) to compute 33456bit data. Thus, the latter is chosen as the candidate microarchitecture for TBCRC module,to follow the requirements 334.56μs>125μs and 2.62μs<125μs. In our reference design,algorithm and microarchitecture are selected by using the optimization method presented in section IV.

    Table II Time allocation for DL PHY modules in eNB (TB: transfer block, CB:code block, b: bit, sc: symbol with complex data and 16bits precision, sr: symbol with real number data and 16bits precision, ---: no limit, Pt: degree of task parallelism, Pd: degree of data parallelism)

    Step 3– Parallelization: Generally, for increasing the throughput of an original module without task level parallelization, data parallelization will be applied first. Like the TBCRC example shown, LFSR as an original microarchitecture computes one input bit at a time and the other parallel structure computes 128 bit in parallel. Hence, the microarchitecture set of a module is often with the varied degree of data parallelism. However, if data parallelization is unavailable, then we have to increase task level parallel to reduce execution time. Task parallel is like a “duplicate” operation that clones a copy of an existing module,leading to two or more modules running in parallel. Thus, the power consumption increases along with increment of task level parallel.As previously mentioned in section IV, under timing constraints, both microarchitecture selection and task parallelization of each PHY module aim at minimizing the energy/power consumption of the whole baseband. It is a NP-hard optimization problem. By using above optimization method, suitable microarchitecture and degree of task parallelism of each PHY module are determined and listed in Table II, III. Meanwhile, time allocation of our reference design is fulfilled as well, shown in Figure 9 and column ‘Time allocation’ in Table II, III.

    In Table II, LUT is lookup table for modulation and precoding respectively, and the table stores complex number and codebook specified in [23]; symbol shaping filter is a 26-taps FIR filter.

    In Table III, DFE is simplified as a 112-taps FIR equivalent to all functions including decimator, rotator, farrow filter and channel filter; matrix inversion of LMMSE is QR decomposition based on Gram-Schmidt; in terms of channel decoding module, a task is a single SISO turbo decoder with 10 decoding iterations; the data precision of LLR (soft bits) is 9 bit.

    5.1.2 System level power estimation and analysis

    Once microarchitecture and task parallel for each PHY module is determined, the logic gates, macro blocks (multipliers, adders and registers) and memory access operations inside a microarchitecture can be precisely counted. For DSP and ASIC platforms, power consumption of each module can be directly estimated by using the power model with specified standard cell library of a CMOS IC technology. Whereas, for FPGA platform, it is usually hard to obtain the accurate logic elements count without the help of synthesize tool. Thus, this paper offers a fast and near accurate power estimation method for FPGA platform, before logic synthesis, based on reuse block at system level as follows.

    First, all of basic components (e.g., adder)used in microarchitecture of all PHY modules are collected. The power consumption of each basic component is measured. In our reference design, the power consumption of 32bit full adder, 16bit multiplier, 16bit D Flip-flop,XOR gate, digital comparator and memory operation is measured on specified FPGA and normalized by the multiplier power. Power consumption of other basic components is negligible. FPGA technology weighting factor is measured, for example, a 16bit×16bit multiplier is 0.0303 mW at 100 MHz in AlteraTMStratix IV GX FPGA. Normalization power consumption arePmul=1/MHz,Padd= 0.6/MHz,Pxor= 0.012/MHz,Pdff= 0.027/MHz,Pcomparator= 0.02 /MHz,Pmemory_access=0.33/Kbit /MHz.

    Next, with normalized components power and the optimization results of microarchitecture selection with specified task parallel,relative power estimation of all PHY modules can be achieved using our power model (3),as illustrated in Figure 10. In Figure 9, the higher task parallel and data parallel is, the less execution time is. Meanwhile, the power dissipation of corresponding module in Figure10 gets higher. On the other hand, power consumption greatly depends on algorithm complexity as well. For instance, even though the execution time of channel equalization module is not yet minimized, its power consumption is sufficiently high due to high complexity of Gram-Schmidt algorithm. Hence, if designers intend to further minimize the execution time of this module without increasing power,Gram-Schmidt has to be replaced by low complexity algorithm. .

    Table III Time allocation for UL PHY modules in eNB (DFE: digital front-end,LMMSE: linear minimum mean square error, MAP: maximum a posteriori, sc/u:symbol/user, b/s: bit/user)

    Fig. 9 Time allocation with specific data and task parallel

    Fig. 10 Proportional distribution of relative power estimation

    Fig. 11 Test bench based on DE4 power measurement

    Finally, one or more modules (reuse blocks)are selected to perform the absolute power calibration (to be discussed in section 5.1.3),and power estimation of the rest modules (new design blocks) can be directly derived from the power calibration of reuse blocks and the relative power ratio in Figure 10.

    5.1.3 Experiment setup and power calibration

    To illustrate the calibration method, AlteraTMDE4 (2010 products) development board with Stratix IV GX FPGA is chosen to measure and evaluate the power consumption of selected modules. There are 12 power supply rails with on-board voltage and current sense capabilities. Built-in 24-bit ADC devices and provided voltage/current measurement methods are used. A power test bench is set up, shown in Figure 11. Downloading the netlist into FPGA core, i.e., device under test (DUT), and feeding random stimuli patterns (data switching activity is 0.5) to DUT via ‘input’ port. The differences (keeping the same FPGA mapping and layout for calibrations with/without DUT)of the electric current on sense resistor of the power supply rail are collected by ADC and transferred to PC. Note that the power measurement of DUT does not contain the power of pins, global wire, main memory and other peripherals on board.

    Once the power figures of selected modules are obtained, the power dissipation of all modules in the reference design can be indirectly figured out according to the ratio of power consumption exposed in figure 10. In this paper, channel decoding module is measured.

    Actually, a 12 soft-input soft-output (SISO)decoder in parallel was used for measuringpower dissipation. To reduce measurement error, the power data collection comprised 100 test groups and each test group contained 10 samples. Each test group was marked with the average of 10 samples. Finally, the mean value of 100 test groups was the final measured power of the channel decoding module.Additionally, there were two cases in each test group:

    Table IV Comparison of PA power estimation (σfeed: feeder loss, ηeffi ciency: PA efficiency, PPA: PA power consumption. The carrier frequency=2.1 GHz is assumed in [3], and other parameters (σfrequency, σpassive, σPPA) in (13) are default values.)

    1) After ‘Reset’ DUT, ‘Input’ and ‘CLK’ports are enabled, and thus the dynamic power combined with static power is measured;

    2) After ‘Reset’, disabling the ‘Input’ port and ‘CLK’ port and thus static power of the FPGA core is measured.

    Hence, the dynamic power of DUT can be computed by means of the whole power measured in 1) subtracts the static power in 2). The power measurement of 1) and 2) was respectively sampled 10 times for each test group.

    Through sufficient measurements, the power consumption of 12 parallel SISOs at 100MHz was 1.02W. The power of 50 SISOs was figured out by scaling it up, i.e., 4.25W.Finally, the dynamic power consumption of all PHY modules in our reference design was 4.25/0.134=31.7 W (the relative power proportion of the channel decoding module is 13.4%in Figure 10).

    The main problems when comparing our power estimation with measured DBB power in other literatures are: the difference lie in 1) algorithm selection, 2) microarchitecture selection and 3) platform selection. However, there is still similarity in communication standard and extreme conditions. In [3], even though the processors solution for DBB is used for power measurement (in 2010), there is a slight difference between their dynamic power 29.6 W (macro cell) and our estimation 31.7 W. This is mainly because both designs are for LTE standard under the same extreme conditions, i.e., peak traffic load. Moreover,we selected FPGA product (Stratix IV GX of 2010) in measurement to further match silicon technologies in 2010. Actually, with the increasing of reuse blocks in new design,more and more low-level physical information can be obtained in early system design stages.Therefore, more accurate power/energy estimation can be achieved.

    5.2 Power amplifier

    Table IV shows the comparison results for PA power estimation of various LTE base station types by using our model (13) and PA power model presented in [3]. The purpose is to validate our PA power model.

    --Macro cell and RRH (remote radio head): Since the digital pre-distortion and Doherty structure are both taken into account for macro cell and RRH in [3], the power efficiency is improved. Thus σstructure=0.45(Doherty structure) in our PA model (13) is chosen to estimate power consumption of macro cell and RRH. As shown in Table IV,our power estimations of them are slightly higher than the measured values in [3], i.e.,132.8>128.2 and 66.6>64.4, which arises from the digital pre-distortion. Power estimation errors are (132.8-128.2)/132.8=3.5% (macro),(66.6-64.4)/66.6= 3.3% (RRH), respectively.

    --Micro cell: Though the digital pre-distortion and Doherty structure are also used for micro cell [3], uncertainty factors result in obvious PA efficiency reduction by comparison with RRH, i.e., 22.8%<31.1%. Then, σstruc-ture=0.45 may be not appropriate in terms of a(27.7-21)/27.7=24% power estimation error.

    --Pico cell and Femto cell: In [3], the high operating back-off gives rise to poor power efficiency in smaller base station types due to the lack of advanced structures and digital pre-distortion support. Thus σstructure=0.1 in our PA model is a suitable choice for power estimation of pico/femto base stations. Power estimation errors are (1.95-1.9)/1.95=2.6%(pico) and (1.1-0.75)/1.1=31.8% (femto), respectively.

    The other information involved in our PA model is not exposed in [3], e.g., passive component loss factor, otherwise, we can further fine-tune other parameters in the PA model.The power estimation error will be further reduced. (More PA details can be found in [18])

    VI. CONCLUSION

    Even though 5G is not yet standardized, some key technologies have been released in [31][32][33]. Then some design explorations should and can be conducted, e.g., building algorithm/microarchitecture sets, finding the entry point of low power design, making a power budget of pre-designed base stations, etc.Obviously, such explorations can benefit from our models. We will introduce and discuss reference system designs for 5G scenarios in our future work.

    In this paper, a holistic power/energy estimation model is proposed for base station system early-stage design. Even without sufficient physical information, fast and near accurate power/energy estimation can be achieved at system level by using this model. This model opens up more opportunities, comparing to existing models, for system designers and users to explore their energy-efficient technologies for 4G and early-stage designs of the future base stations. Additionally, an energy/power optimization method based on our DBB energy model is proposed. By this optimization method, designers can rapidly evaluate distinct DBB design alternatives and find optimal DBB designs for energy minimization.

    A software platform based on the proposed models and optimization methods has been developed. We will find more application scenarios for energy saving during designs and operations.

    ACKNOWLEDGEMENTS

    The finance supporting from National High Technical Research and Development Program of China (863 program) 2014AA01A705 is sincerely acknowledged by authors.

    Note

    1. The main abbreviations of the paper are listed here.

    [1] 3GPP TR 36.927,“Potential Solutions for Energy Saving for E-UTRAN”, 2012,www.3gpp.org/dynareport/36927.htm

    [2] J.G Andrews, S. Buzzi, W. Choi,et al, “What Will 5G Be?”.IEEE Journal on Selected Areas in Communications, vol.32, no.6, pp 1065–1082, June,2014.

    [3] G. Auer, V. Giannini, C. Desset,et al, “How Much Energy Is Needed to Run a Wireless Network”,IEEE Wireless Communications, vol. 18, no. 4, pp 40-49, Oct. 2011.

    [4] O. Arnold, F. Richter, G. Fettweis,et al, “Power Consumption Modeling of Different Base Station Types in Heterogeneous Cellular Networks”,IEEE 2010 Future Network & Mobile Summit. Florence, June, 2010, pp 1–8.

    [5] C. Desset, B. Debaillie, V. Giannini,et al, “Flexible Power Modeling of LTE Base Stations”,IEEE Wireless Communications and Networking Conference (WCNC), Apr. 2012, pp 2858–2862.

    [6] R.V.R Kumar, J. Gurugubelli, “How Green the LTE Technology Can Be?”,IEEE 2011 2nd International Conference on Wireless Communication,Vehicular Technology, Information Theory and Aerospace and Electronic Systems Technology,Mar. 2011, pp 1-5.

    [7] H. Holtkamp, G. Auer,et al, “A Parameterized Base Station Power Model”,IEEE Communications Letters, vol.17, no.11,pp 2033–2035, Nov.2013.

    [8] H.K Boyapati, R.V Rajakumar, S. Chakrabarti,“Quantifying the Improvement in Energy Savings for LTE eNodeB Baseband Subsystem with Technology Scaling and Multi-core Architectures”,IEEE 2012 National Conference on Communications (NCC),Feb. 2012, pp 1-5.

    [9] H.K Boyapati, R.V.R Kumar, S. Chakrabarti,“Characterization of Baseband Implementations Using Energy Efficiency Metric Based Quantification”,IEEE 2011 International Conference on Devices and Communications (ICDeCom), Feb.2011, pp 1-5.

    [10] K. Keutzer, A.R Newton, J.M Rabaey, et al,“System-Level Design: Orthogonalization of Concerns and Platform-based Design”,IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,vol.19, no.12, pp 1523–1543, Dec. 2000.

    [11] W. Wang, X.Y Li, D. Liu,et al, “Multilevel Power Modeling of Base Station and Its ICs”,China Communications, vol. 12, no.5, pp 22–33, May,2015.

    [12] “International Technology Roadmap for Semiconductors (2007), Design”, 2007, http://www.itrs2.net/itrs-reports.html

    [13] N.A Sherwani, “Algorithm for VLSI Physical Design Automation”,Dordrecht, NED. Kluwer Academic Publishers, 2002.

    [14] D. Liu, “Baseband ASIP Design for SDR”,China Communications, vol.12, no.7, pp 60–72, July,2015.

    [15] “International Technology Roadmap for Semiconductors (2013), Interconnect”, 2013, http://www.itrs2.net/itrs-reports.html

    [16] B. Murmann, “ADC Performance Survey 1997-2016”, http://web.stanford.edu/~murmann/adcsurvey.html

    [17] H.S Bennett, R. Brederlow, J.C Costa,et al, “Device and Technology Evolution for Si-based RF Integrated Circuits”,IEEE Transactions on Electron Devices, vol.52, no.7, pp 1235–1258, July,2005.

    [18] S.C Cripps, “RF Power Amplifiers for Wireless Communications”,Norwood, MA,USA. ARTECH HOUSE, 2006.

    [19] F. Boccardi, R. Heath, A. Lozano,et al, “Five Disruptive Technology Directions for 5G”,IEEE Communications Magazine, vol. 52, no. 2, pp 74–80, Feb. 2014.

    [20] P. Coussy, A. Morawiec, “High-Level Synthesis From Algorithm to Digital Circuit”,Springer,2008.

    [21] B. Debaillie, A.Giry, M.J Gonzalez,et al, “Opportunities for Energy Savings in Pico/Femti-cell Base-Station”,IEEE Future Network & Mobile Summit 2011 (FutureNetw), June, 2011, pp 1–8.

    [22] Telecommunications Industry Association, “TR-45 Ad-Hoc Group on International Mobile Telecommunications, TR-45 -AHIMT (2008)”,ftp.tiaonline.org/TR-45/TR-45_AHIMT/Public/TR-45_Final_Report.pdf

    [23] 3GPP TS 36.211, “Evolved Universal Terrestrial Radio Access (E-UTRA):Physical channels and modulation”, 2010, http://www.3gpp.org/ftp/Specs/archive/36_series/

    [24] 3GPP TS 36.212, “Evolved Universal Terrestrial Radio Access (E-UTRA):Multiplexing and channel coding”, 2010, http://www.3gpp.org/ftp/Specs/archive/36_series/

    [25] Y.H Huo, X.Y Li,W. Wang,et al, “High Performance Table-Based Architecture for Parallel CRC Calculation”,The 21st IEEE International Workshop on Local and Metropolitan Area Networks, Apr. 2015, pp 1–6.

    [26] Y.T Hwang, J.Y Chen, M.H Sheu, “Automatic Generation of Programmable Parallel CRC &Scrambler Designs”,2006 IEEE Workshop on Signal Processing Systems Design and Implementation (SIPS), Oct. 2006, pp 286–291.

    [27] C. Yu, M.H Yen, “Area-Efficient 128- to 2048/1536-Point Pipeline FFT Processor for LTE and Mobile WiMAX Systems”,IEEE Transactions on Very Large Scale Integration (VLSI) Systems,vol. 23, no. 9, pp 1793–1800, Sep. 2015.

    [28] Ji.N Chen, J.H Hu,et al, “Hardware Efficient Mixed Radix-25/16/9 FFT for LTE Systems”,IEEE Transactions on Very Large Scale Integration(VLSI) Systems,vol. 23, no. 2, pp 221–229, Feb.2015.

    [29] A. Gomaa, L.M.A Jalloul, M. Mansour, “Max-Log-MAP Optimal MU-MIMO Receiver for Joint Data Detection and Interferer Modulation Classification”,IEEE Communications Letters, vol. 20,no. 7, pp 1389–1392, July, 2016.

    [30] Z.Z Wu, D. Liu, “High-Throughput Trellis Processor for Multistandard FEC Decoding”,IEEE Transactions on Very Large Scale Integration(VLSI) Systems, vol. 23, no. 12, pp 2757–2767,Dec. 2015.

    [31] IMT-2020 (5G) Promote Group, “White Paper on 5G Wireless Technology Architecture”, 2015.

    [32] P. Z Fan, J. Zhao,et al. “5G High Mobility Wireless Communications: Challenges and Solutions”, [J]China Communications, Vol. 13, no.Supplement2, pp: 1-13, Jan. 2017.

    [33] X. Meng, J. Li,et al. “5G Technology Requirements and Related Test Environments for Evaluation”, [J]China Communications, Vol. 13, no.Supplement2, pp: 42-51, Jan. 2017.

    亚洲美女视频黄频| 青春草视频在线免费观看| 亚洲精品国产av成人精品| 久久久久久国产a免费观看| 欧美日韩精品成人综合77777| 麻豆久久精品国产亚洲av| 日韩国内少妇激情av| av黄色大香蕉| 青春草国产在线视频 | av在线播放精品| 亚洲无线在线观看| 一区二区三区高清视频在线| 波多野结衣高清作品| 91aial.com中文字幕在线观看| 国产精品1区2区在线观看.| 人人妻人人看人人澡| 在线播放国产精品三级| 秋霞在线观看毛片| 久久国内精品自在自线图片| 嫩草影院新地址| 国产男人的电影天堂91| 97超视频在线观看视频| 国产成人91sexporn| 久久99热这里只有精品18| 久久久成人免费电影| 国产一区二区三区av在线 | 日韩欧美三级三区| 久久中文看片网| 亚洲欧美精品专区久久| h日本视频在线播放| 成年版毛片免费区| 成人高潮视频无遮挡免费网站| 国产av麻豆久久久久久久| 日韩欧美国产在线观看| 色视频www国产| 可以在线观看的亚洲视频| 久久国内精品自在自线图片| 欧美日韩乱码在线| 国产高清视频在线观看网站| 高清日韩中文字幕在线| 99热网站在线观看| av国产免费在线观看| 22中文网久久字幕| 免费av不卡在线播放| 天堂av国产一区二区熟女人妻| 丝袜喷水一区| 性欧美人与动物交配| 非洲黑人性xxxx精品又粗又长| 国产成人影院久久av| 亚洲精品影视一区二区三区av| 狠狠狠狠99中文字幕| 九九久久精品国产亚洲av麻豆| 一区二区三区免费毛片| 国产极品精品免费视频能看的| 晚上一个人看的免费电影| 岛国毛片在线播放| 欧美另类亚洲清纯唯美| 桃色一区二区三区在线观看| 麻豆乱淫一区二区| 女人被狂操c到高潮| 国产一级毛片在线| 国产视频内射| 嘟嘟电影网在线观看| 亚洲第一电影网av| 久久久久久久久久久丰满| 老女人水多毛片| 国产综合懂色| 精品久久久久久成人av| 99热6这里只有精品| 亚洲国产色片| 久99久视频精品免费| 网址你懂的国产日韩在线| 免费av观看视频| 欧美日韩国产亚洲二区| 欧美高清成人免费视频www| 国产激情偷乱视频一区二区| 午夜精品一区二区三区免费看| 国产精品电影一区二区三区| 日韩视频在线欧美| 国产色婷婷99| 成人一区二区视频在线观看| 联通29元200g的流量卡| 日本一本二区三区精品| 国产在线精品亚洲第一网站| 欧美成人免费av一区二区三区| 国产av一区在线观看免费| 成人二区视频| 欧美丝袜亚洲另类| 麻豆国产av国片精品| 毛片一级片免费看久久久久| 夜夜爽天天搞| 亚洲aⅴ乱码一区二区在线播放| 91久久精品国产一区二区成人| 在线观看美女被高潮喷水网站| 亚洲高清免费不卡视频| 亚洲最大成人手机在线| 1024手机看黄色片| 国产亚洲av嫩草精品影院| 欧美色视频一区免费| 国产精品一区www在线观看| 男人和女人高潮做爰伦理| 精品一区二区三区视频在线| 国产黄色视频一区二区在线观看 | 夜夜爽天天搞| 免费观看的影片在线观看| 午夜免费激情av| 久久午夜亚洲精品久久| 国产亚洲精品久久久com| 三级毛片av免费| 久久久国产成人精品二区| 国产在视频线在精品| 国产探花在线观看一区二区| 国产久久久一区二区三区| 久久99热6这里只有精品| 亚洲av不卡在线观看| 久久久久九九精品影院| 成人午夜高清在线视频| 成年女人永久免费观看视频| 在线观看一区二区三区| 欧美日本亚洲视频在线播放| 亚洲精品成人久久久久久| 九九久久精品国产亚洲av麻豆| 91麻豆精品激情在线观看国产| 99热只有精品国产| 亚洲18禁久久av| 亚洲无线在线观看| 欧美日韩乱码在线| 好男人视频免费观看在线| 亚洲国产色片| 中出人妻视频一区二区| 真实男女啪啪啪动态图| 午夜免费男女啪啪视频观看| 日韩,欧美,国产一区二区三区 | 日韩欧美在线乱码| 亚洲欧美日韩东京热| 欧美日韩综合久久久久久| 亚洲第一区二区三区不卡| 国产三级在线视频| 久久精品国产自在天天线| 菩萨蛮人人尽说江南好唐韦庄 | 波野结衣二区三区在线| 欧美区成人在线视频| 久久热精品热| 少妇人妻一区二区三区视频| 日日撸夜夜添| 亚洲av成人av| 一级二级三级毛片免费看| 好男人视频免费观看在线| 国产麻豆成人av免费视频| 麻豆av噜噜一区二区三区| 中文字幕制服av| 国产探花极品一区二区| av天堂在线播放| 一级毛片电影观看 | 在线观看一区二区三区| 亚洲国产欧洲综合997久久,| 18禁在线播放成人免费| 欧美成人一区二区免费高清观看| 久久久久九九精品影院| 国产成人aa在线观看| 国产单亲对白刺激| 亚洲天堂国产精品一区在线| 18+在线观看网站| 成年版毛片免费区| 在线观看av片永久免费下载| 国产三级在线视频| 菩萨蛮人人尽说江南好唐韦庄 | 狂野欧美白嫩少妇大欣赏| 国产精品久久视频播放| 精品一区二区三区视频在线| 亚洲四区av| 国产黄片视频在线免费观看| 亚洲国产精品成人综合色| 久久九九热精品免费| 久久综合国产亚洲精品| 狠狠狠狠99中文字幕| 中文字幕熟女人妻在线| 超碰av人人做人人爽久久| 18禁在线无遮挡免费观看视频| 成人性生交大片免费视频hd| 91精品国产九色| 欧美成人免费av一区二区三区| 亚洲人成网站高清观看| 赤兔流量卡办理| 伦理电影大哥的女人| 天堂中文最新版在线下载 | 精品久久久久久久久久久久久| 国产成人精品一,二区 | 草草在线视频免费看| 精品一区二区免费观看| 最新中文字幕久久久久| av又黄又爽大尺度在线免费看 | 日本五十路高清| 看非洲黑人一级黄片| 国产精品日韩av在线免费观看| 嫩草影院入口| 久久精品夜色国产| 欧洲精品卡2卡3卡4卡5卡区| 黄色欧美视频在线观看| 黄色一级大片看看| 国产精品一区二区三区四区免费观看| 国产精品久久久久久久电影| 自拍偷自拍亚洲精品老妇| 国产男人的电影天堂91| 久久久精品欧美日韩精品| 午夜福利视频1000在线观看| 亚洲av不卡在线观看| 久久久欧美国产精品| 亚洲成人av在线免费| 午夜福利在线在线| 91麻豆精品激情在线观看国产| 美女xxoo啪啪120秒动态图| 女人十人毛片免费观看3o分钟| 亚洲欧美日韩高清专用| 男人狂女人下面高潮的视频| 国产色爽女视频免费观看| 日韩国内少妇激情av| 亚洲精品乱码久久久久久按摩| 菩萨蛮人人尽说江南好唐韦庄 | 久久精品国产亚洲av涩爱 | 一边摸一边抽搐一进一小说| 人妻系列 视频| 亚洲精品乱码久久久v下载方式| 99久久精品一区二区三区| 国产精品.久久久| 欧美精品一区二区大全| 婷婷色综合大香蕉| 变态另类丝袜制服| 国产老妇伦熟女老妇高清| 色尼玛亚洲综合影院| 一级毛片aaaaaa免费看小| 亚洲欧美精品自产自拍| av国产免费在线观看| 色综合站精品国产| 给我免费播放毛片高清在线观看| 中出人妻视频一区二区| 国产av麻豆久久久久久久| 在线观看免费视频日本深夜| 久久99精品国语久久久| 日日撸夜夜添| 91久久精品国产一区二区成人| 亚洲久久久久久中文字幕| 熟女电影av网| eeuss影院久久| 久久久久久国产a免费观看| 丝袜美腿在线中文| 亚洲av电影不卡..在线观看| 国产伦理片在线播放av一区 | 国产 一区精品| 99久久精品热视频| 嫩草影院新地址| 麻豆国产av国片精品| 国产久久久一区二区三区| 色综合色国产| 村上凉子中文字幕在线| 波多野结衣高清作品| 99riav亚洲国产免费| 久久精品国产亚洲av天美| 精品久久久久久久久久免费视频| 国产精品电影一区二区三区| 欧美日韩综合久久久久久| 男女做爰动态图高潮gif福利片| 我要搜黄色片| 美女 人体艺术 gogo| 久久久精品大字幕| 免费无遮挡裸体视频| 哪个播放器可以免费观看大片| 久久综合国产亚洲精品| 精品人妻熟女av久视频| 免费观看精品视频网站| 成人亚洲欧美一区二区av| 亚洲国产精品sss在线观看| av在线天堂中文字幕| 韩国av在线不卡| 亚洲中文字幕日韩| 一进一出抽搐gif免费好疼| 久久久久久久久久黄片| 99久久无色码亚洲精品果冻| 午夜精品国产一区二区电影 | 九九爱精品视频在线观看| 国产成人一区二区在线| 国产精品无大码| 日本免费a在线| 干丝袜人妻中文字幕| 亚洲无线观看免费| 男人舔奶头视频| 精品久久国产蜜桃| 亚洲成人久久爱视频| 国产伦精品一区二区三区四那| 亚洲中文字幕一区二区三区有码在线看| 日韩欧美三级三区| 插阴视频在线观看视频| 国产亚洲av片在线观看秒播厂 | 搡女人真爽免费视频火全软件| 欧美zozozo另类| 亚洲国产精品久久男人天堂| 三级经典国产精品| 亚洲美女视频黄频| 一边摸一边抽搐一进一小说| 高清在线视频一区二区三区 | 亚洲在线自拍视频| 国产高清不卡午夜福利| 国内揄拍国产精品人妻在线| 免费搜索国产男女视频| 欧美成人免费av一区二区三区| 搞女人的毛片| 美女大奶头视频| 中国国产av一级| 久久草成人影院| 成人高潮视频无遮挡免费网站| 少妇丰满av| 一区二区三区四区激情视频 | 男人舔奶头视频| 我要搜黄色片| 亚洲国产精品久久男人天堂| 国产成人精品久久久久久| 麻豆国产97在线/欧美| 欧美色欧美亚洲另类二区| 成人欧美大片| 成人毛片a级毛片在线播放| 国产成年人精品一区二区| 午夜激情福利司机影院| 国产成人一区二区在线| 亚洲欧美精品专区久久| 免费大片18禁| 中出人妻视频一区二区| 国产精品1区2区在线观看.| 哪里可以看免费的av片| 久久久久性生活片| 波野结衣二区三区在线| av又黄又爽大尺度在线免费看 | av免费观看日本| 一个人免费在线观看电影| 最近的中文字幕免费完整| 赤兔流量卡办理| 此物有八面人人有两片| 国产精品女同一区二区软件| 亚洲欧美精品综合久久99| 一个人看的www免费观看视频| 97超视频在线观看视频| 国产黄色视频一区二区在线观看 | 国产av麻豆久久久久久久| 2022亚洲国产成人精品| 美女内射精品一级片tv| 欧美一级a爱片免费观看看| 久久久久久九九精品二区国产| 一个人免费在线观看电影| 久久久a久久爽久久v久久| 91av网一区二区| 最近的中文字幕免费完整| 九九在线视频观看精品| 99九九线精品视频在线观看视频| 国产高清三级在线| 男人舔女人下体高潮全视频| 国产麻豆成人av免费视频| 成年版毛片免费区| 爱豆传媒免费全集在线观看| 精品无人区乱码1区二区| 欧美激情久久久久久爽电影| 国产蜜桃级精品一区二区三区| 99在线视频只有这里精品首页| av在线亚洲专区| 久久精品国产自在天天线| 亚洲精品国产av成人精品| 啦啦啦观看免费观看视频高清| 大型黄色视频在线免费观看| 黄色视频,在线免费观看| 干丝袜人妻中文字幕| 国产精品乱码一区二三区的特点| 99热网站在线观看| 国产真实乱freesex| 国产伦理片在线播放av一区 | 中文字幕熟女人妻在线| 99久国产av精品| 波野结衣二区三区在线| 免费不卡的大黄色大毛片视频在线观看 | 精品免费久久久久久久清纯| 国产久久久一区二区三区| 亚洲国产高清在线一区二区三| 久久久久久久午夜电影| 成人鲁丝片一二三区免费| 夫妻性生交免费视频一级片| 亚洲熟妇中文字幕五十中出| 国产亚洲欧美98| 欧美区成人在线视频| 国产一区二区亚洲精品在线观看| 国产精品蜜桃在线观看 | 草草在线视频免费看| 亚洲国产精品成人久久小说 | 久久综合国产亚洲精品| 欧洲精品卡2卡3卡4卡5卡区| 日本黄大片高清| 深爱激情五月婷婷| 91精品国产九色| 亚洲电影在线观看av| 综合色av麻豆| 在线播放国产精品三级| 午夜福利在线在线| 成人av在线播放网站| 亚洲欧美成人精品一区二区| 精品国产三级普通话版| av女优亚洲男人天堂| 国产激情偷乱视频一区二区| 26uuu在线亚洲综合色| 成人av在线播放网站| www.av在线官网国产| 亚洲内射少妇av| 91精品国产九色| 免费观看在线日韩| 欧美性感艳星| 国内久久婷婷六月综合欲色啪| 中文亚洲av片在线观看爽| 国产一区二区三区在线臀色熟女| 在线播放国产精品三级| 国产av在哪里看| videossex国产| 久久久久免费精品人妻一区二区| 亚洲av男天堂| 国产视频内射| 永久网站在线| 精品免费久久久久久久清纯| 男女那种视频在线观看| 18禁黄网站禁片免费观看直播| 亚洲精品国产av成人精品| 亚洲av中文字字幕乱码综合| 男人舔女人下体高潮全视频| 18禁在线无遮挡免费观看视频| 赤兔流量卡办理| 日韩欧美一区二区三区在线观看| 少妇熟女欧美另类| 波野结衣二区三区在线| 国国产精品蜜臀av免费| 日本在线视频免费播放| 亚洲精品亚洲一区二区| 久久6这里有精品| 99精品在免费线老司机午夜| 久久这里只有精品中国| 啦啦啦啦在线视频资源| 人妻夜夜爽99麻豆av| 免费搜索国产男女视频| 国产精品人妻久久久久久| 亚洲18禁久久av| 欧美高清性xxxxhd video| 神马国产精品三级电影在线观看| 99久久精品国产国产毛片| 久久精品国产亚洲av涩爱 | 国产乱人偷精品视频| 国产精品福利在线免费观看| 成人漫画全彩无遮挡| 黄色一级大片看看| 色哟哟·www| 老师上课跳d突然被开到最大视频| 亚洲乱码一区二区免费版| 男人的好看免费观看在线视频| 91狼人影院| 国产一区亚洲一区在线观看| 免费看av在线观看网站| 午夜福利高清视频| 男女视频在线观看网站免费| 欧美高清成人免费视频www| 中文精品一卡2卡3卡4更新| 国内久久婷婷六月综合欲色啪| 只有这里有精品99| 99久国产av精品国产电影| 伦理电影大哥的女人| 舔av片在线| 联通29元200g的流量卡| 亚洲美女视频黄频| 麻豆成人午夜福利视频| 91精品一卡2卡3卡4卡| 亚洲精品久久久久久婷婷小说 | 久久精品国产鲁丝片午夜精品| 免费看光身美女| 天堂中文最新版在线下载 | 久久精品国产99精品国产亚洲性色| 九草在线视频观看| 成人漫画全彩无遮挡| 免费观看精品视频网站| 菩萨蛮人人尽说江南好唐韦庄 | 波多野结衣高清作品| 特大巨黑吊av在线直播| 99久久久亚洲精品蜜臀av| 91久久精品国产一区二区成人| 国产精品人妻久久久久久| 国产精品,欧美在线| 一卡2卡三卡四卡精品乱码亚洲| 色哟哟哟哟哟哟| 久久人人爽人人爽人人片va| 一卡2卡三卡四卡精品乱码亚洲| 国产精品福利在线免费观看| 亚洲成人久久爱视频| 国产极品精品免费视频能看的| 综合色av麻豆| 国产精品久久久久久久久免| 又粗又爽又猛毛片免费看| 毛片一级片免费看久久久久| 国产高清三级在线| 久久久久久久久大av| 精品99又大又爽又粗少妇毛片| 婷婷色综合大香蕉| 欧美成人精品欧美一级黄| 中文欧美无线码| 国产单亲对白刺激| av免费观看日本| 高清毛片免费看| 国产 一区精品| 熟女人妻精品中文字幕| 久久久欧美国产精品| 国国产精品蜜臀av免费| 欧美最黄视频在线播放免费| 色综合亚洲欧美另类图片| 成熟少妇高潮喷水视频| 久久国产乱子免费精品| 秋霞在线观看毛片| 国产乱人视频| 亚洲成人久久爱视频| 99久久久亚洲精品蜜臀av| 亚洲av免费高清在线观看| 99视频精品全部免费 在线| 最后的刺客免费高清国语| h日本视频在线播放| 一个人观看的视频www高清免费观看| 精品少妇黑人巨大在线播放 | 精品久久国产蜜桃| 国产精品麻豆人妻色哟哟久久 | 午夜视频国产福利| 麻豆国产97在线/欧美| 亚洲三级黄色毛片| 长腿黑丝高跟| 亚洲三级黄色毛片| 哪里可以看免费的av片| 黄色欧美视频在线观看| 97超视频在线观看视频| 国产色爽女视频免费观看| 我的女老师完整版在线观看| 又爽又黄a免费视频| 男女做爰动态图高潮gif福利片| 成人亚洲精品av一区二区| eeuss影院久久| 亚洲欧美中文字幕日韩二区| 精品熟女少妇av免费看| 亚洲最大成人手机在线| 中文亚洲av片在线观看爽| 一边摸一边抽搐一进一小说| 人人妻人人澡人人爽人人夜夜 | 真实男女啪啪啪动态图| 国产私拍福利视频在线观看| 亚洲第一电影网av| 国产成人91sexporn| 国内久久婷婷六月综合欲色啪| 亚洲美女视频黄频| 久久精品国产亚洲av天美| 精品国产三级普通话版| 亚洲美女搞黄在线观看| 欧美bdsm另类| 99久久人妻综合| 国产精品一及| 精品久久久久久久久久免费视频| 亚洲av免费在线观看| 国产色爽女视频免费观看| 深夜精品福利| 国产探花在线观看一区二区| 国产一区二区激情短视频| 国产午夜精品一二区理论片| 男女边吃奶边做爰视频| 国产av麻豆久久久久久久| 国产一区二区激情短视频| 久久久欧美国产精品| 欧美性猛交╳xxx乱大交人| 91狼人影院| 国产黄片视频在线免费观看| 亚洲色图av天堂| 免费搜索国产男女视频| 插逼视频在线观看| 男女视频在线观看网站免费| 日本成人三级电影网站| 亚洲四区av| 亚洲精品乱码久久久久久按摩| 欧美丝袜亚洲另类| 欧美日本亚洲视频在线播放| 中出人妻视频一区二区| 久久精品久久久久久噜噜老黄 | 亚洲精品456在线播放app| 美女大奶头视频| 校园春色视频在线观看| 久久这里只有精品中国| 久久久久久久久久久丰满| 亚洲av男天堂| 亚洲欧美日韩高清在线视频| 麻豆国产97在线/欧美| 日韩大尺度精品在线看网址| 一边摸一边抽搐一进一小说| 国产av一区在线观看免费| 男女做爰动态图高潮gif福利片| av黄色大香蕉| 久久精品影院6| 亚洲精华国产精华液的使用体验 | 日本黄色片子视频| 深夜a级毛片| 欧美极品一区二区三区四区| 少妇被粗大猛烈的视频| av黄色大香蕉| 波多野结衣高清作品| 久久久久久久久中文| 精品免费久久久久久久清纯| 波野结衣二区三区在线| 久久久色成人| 亚洲av中文av极速乱| 男人狂女人下面高潮的视频| 亚洲美女视频黄频| 久久久精品大字幕| 国产一区二区亚洲精品在线观看| 亚洲无线观看免费| 免费搜索国产男女视频| 成年版毛片免费区|