
    Large-Capacity and High-Speed Instruction Cache Based on Divide-by-2 Memory Banks

    2022-01-08 13:06:28

    Qing-Qing Li | Zhi-Guo Yu | Yi Sun | Jing-He Wei | Xiao-Feng Gu

    Abstract—An increase in the cache capacity is usually accompanied by a decrease in access speed. To balance the capacity and performance of caches, this paper proposes an instruction cache (ICache) architecture based on divide-by-2 memory banks (D2MB-ICache). The control circuit and memory banks of D2MB-ICache work at the central processing unit (CPU) frequency and the divide-by-2 CPU frequency, respectively, so that the capacity of D2MB-ICache can be expanded without lowering its frequency. For sequential access, D2MB-ICache can output the required instruction from memory banks per CPU cycle by dividing the memory banks with a partition mechanism and employing an inversed clock technique. For non-sequential access, D2MB-ICache will fetch certain jump instructions one or two more times, so that it can catch the jump of the request address in time and send the correct instruction to the pipeline. Experimental results show that, compared with conventional ICache, D2MB-ICaches with the same and double capacities increase the maximum frequency by an average of 14.6% and 6.8%, and improve performance by an average of 10.3% and 3.8%, respectively. Moreover, the energy efficiency of the 64-kB D2MB-ICache is improved by 24.3%.

    Index Terms—Cache capacity expansion, divide-by-2 frequency, instruction cache (ICache), inversed clock.

    1. Introduction

    In recent years, it has become increasingly difficult for the bandwidth and speed of the main memory to supply processors with the required amount of data, leaving the processors unable to exhibit the desired performance. To solve this problem, a cache memory is usually included in modern computer systems. An instruction cache (ICache) utilizes the principle of locality to store some parts of programs from the main memory, thereby greatly improving the execution speed of the programs. Caches can mitigate the performance discrepancy between processors and the main memory, which has an important impact on the overall performance of processors.

    Fig. 1 illustrates the structure of a conventional 4-way set-associative ICache [1]. Each memory address consists of the tag, set index, and byte offset. Basically, read operations in conventional ICache include addressing, reading the tag and data banks, judging the hit/miss state, and outputting the selected instruction. When there is a request, the tag and data banks will be addressed according to the “set index” and “set index+byte offset”, respectively. Then, the tag value extracted from the memory address will be compared with the tag values read from the tag bank. If the tag hits, the corresponding instruction will be selected; if the tag misses, the request will be forwarded to the lower level memory. The cache performance is generally determined by the hit ratio, latency, speed, and power consumption [2]-[4]. A large cache capacity mitigates capacity misses and improves the hit ratio. However, an increase in the cache size also increases the access time, restricting the development of high-frequency processors [5]. Since more and more advanced processors with high data bandwidth require a larger and faster cache [4], [6], [7], it becomes necessary to explore a tradeoff between the cache capacity and performance.
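The lookup steps described above (split the address, index a set, compare tags, report hit or miss) can be sketched as follows. This is a minimal illustration only: the set count, block size, and all names are assumptions chosen for the example and are not taken from the paper's configuration.

```python
from dataclasses import dataclass, field


@dataclass
class SetAssociativeCache:
    """Toy tag store of a set-associative cache (parameters are assumed)."""
    num_sets: int = 64      # assumed; must be a power of two
    num_ways: int = 4       # 4-way, as in Fig. 1
    block_bytes: int = 64   # assumed; must be a power of two
    tags: list = field(default_factory=list)

    def __post_init__(self):
        self.offset_bits = self.block_bytes.bit_length() - 1
        self.index_bits = self.num_sets.bit_length() - 1
        self.tags = [[None] * self.num_ways for _ in range(self.num_sets)]

    def split(self, addr):
        """Return (tag, set index, byte offset) of a byte address."""
        offset = addr & (self.block_bytes - 1)
        index = (addr >> self.offset_bits) & (self.num_sets - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        return tag, index, offset

    def lookup(self, addr):
        """Return the hitting way, or None on a miss."""
        tag, index, _ = self.split(addr)
        for way, stored in enumerate(self.tags[index]):
            if stored == tag:
                return way
        return None

    def fill(self, addr, way):
        """Install the tag of addr into the given way (refill after a miss)."""
        tag, index, _ = self.split(addr)
        self.tags[index][way] = tag


cache = SetAssociativeCache()
first = cache.lookup(0x1234)    # miss: nothing installed yet
cache.fill(0x1234, way=2)
second = cache.lookup(0x1234)   # hit in way 2
```

On a miss the request would be forwarded to the lower level memory; the sketch models only the tag comparison, not the data banks or a replacement policy.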

    On the one hand, researchers have attempted to make full use of the limited cache space and optimize the cache performance, for example, by improving cache replacement algorithms [8]-[10], developing prefetching mechanisms [11], [12], and introducing data compression techniques [13], [14]. The cache replacement algorithms and prefetching mechanisms effectively manage the limited L1 cache and decrease the miss penalty. Nevertheless, the dilemma of the L1 cache capacity still exists. Data compression can increase the effective (logical) capacity of caches, but it introduces decompression latency. Compression schemes are therefore more suitable for the last-level caches (LLCs), which focus on minimizing the miss rate [15]. On the other hand, alternative memory technologies have also been explored [16]-[21], since the static random access memory (SRAM) cannot exhibit ideal performance due to its low density and high leakage power in ultradeep submicron processes. Owing to their negligible leakage power and high density, non-volatile memories (NVMs) are expected to be applied to many fields in the future [22]. However, the endurance of NVMs is insufficient, which is the major obstacle to replacing SRAM with NVMs. As a result, SRAM is still the most widely used technology in the highest-level cache.

    This paper proposes a large-capacity and high-speed ICache architecture based on divide-by-2 memory banks (D2MB-ICache). Different from conventional ICache, D2MB-ICache operates at the central processing unit (CPU) frequency while its memory banks operate at the divide-by-2 CPU frequency. This method can expand the D2MB-ICache capacity without lowering its frequency. However, because the memory banks take two CPU cycles per fetch, the instruction fetch speed would be halved compared with conventional ICache, as shown in Fig. 2. We therefore achieve a large-capacity and high-speed D2MB-ICache architecture through the following three contributions.

    1) To avoid missing requests when they cross the CPU clock and the memory bank clock, the data bank and the tag bank are divided according to a partition mechanism.

    Fig. 1. Conventional 4-way set-associative ICache.

    Fig. 2. Read policy in different ICaches: (a) conventional ICache and (b) D2MB-ICache.

    2) D2MB-ICache triggers memory banks with inversed clocks and allocates instructions to adjacent memory banks so that the read operations in D2MB-ICache are similar to those in conventional ICache. The required instruction is taken from the memory banks per CPU cycle unless there is a non-sequential request address.

    3) A jump replay (JR) module is designed to avoid missing the read request when the access address is non-sequential. The JR module can detect the change of the request address and control the read operations in D2MB-ICache.

    The performance evaluation results indicate that, in the 55-nm Semiconductor Manufacturing International Corporation (SMIC) complementary metal-oxide-semiconductor (CMOS) process, D2MB-ICaches with the same (1×) and double (2×) capacities can increase the maximum frequency and reduce the execution time compared with conventional ICache. Quadruple-size D2MB-ICache shows almost no reduction in the maximum frequency and almost no increase in the execution time. Moreover, the proposed ICache with a large capacity significantly improves energy efficiency. The rest of this paper is organized as follows. Section 2 describes the architecture of D2MB-ICache. In Section 3, the write/read operations in D2MB-ICache are presented. Experimental results are described and explained in Section 4. Section 5 discusses related research work. Finally, the conclusion is drawn in Section 6.

    2. D2MB-ICache Architecture

    As shown in Fig. 3, the tag and data banks in D2MB-ICache are divided into multiple small memory banks that operate at the divide-by-2 frequency. In addition, a JR module and an address change judgment for read operations are added to detect how Req_addr changes and determine whether D2MB-ICache repeatedly accesses a non-sequential address. If a request to D2MB-ICache occurs, a cache line is selected after addressing. Meanwhile, the JR module compares the lowest bit of “index” (TS) and the lowest log2(N) bits of “index+offset - log2(LBW/8)” (DS) in the current request address with those of the address in S1; TS and DS are described in detail in subsection 2.1. Based on the change of the request address, the JR module will send three signals (S1_kill, S1_replay, and S2_replay) to control whether the access operation needs to be repeated. Finally, the proposed ICache either outputs the corresponding instructions or forwards the request to the next level memory. Table 1 lists the abbreviations used in this work.

    Table 1: Abbreviations in this work

    Fig. 3. D2MB-ICache architecture.

    2.1. Partition Mechanism of Memory Banks

    In conventional ICache, the operating frequency of memory banks is usually the same as that of CPU. Conventional ICache can store SBW-bit data or load LBW-bit data per CPU cycle. However, D2MB-ICache needs to store (2×SBW)-bit data per memory bank clock cycle. If the number of data memory banks in D2MB-ICache were still SBW/LBW, part of the written data would be lost. To solve this problem, the tag and data banks are divided into two tag memory banks (T0 and T1) and N data memory banks (D0, D1, ···, D_{N-1}), respectively. The width of each data memory bank is LBW bits. The parameter N is expressed as follows:

    Since the data packet transferred between D2MB-ICache and CPU is LBW bits, the address offset of the next sequential access is LBW/8. The instructions are stored in the location specified by “index+offset - log2(LBW/8)” of the request address. As shown in Fig. 3, D2MB-ICache utilizes DS to select the corresponding data memory bank. DS from the request address is defined as follows:

    The tag is stored in the location specified by “index”. Therefore, the tag memory banks can be enabled via TS (see Fig. 3), which is taken from “index”. TS is defined as follows:
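A minimal sketch of the partition mechanism follows, under stated assumptions. It assumes N = 2 × SBW / LBW, which is inferred from the requirement to store (2×SBW)-bit data per bank clock cycle in LBW-bit banks and agrees with the example in subsection 3.1 (SBW = 64 and LBW = 32 give N = 4). DS and TS follow the verbal definitions in Section 2: DS is the lowest log2(N) bits of the address after dropping the lowest log2(LBW/8) bits, and TS is the lowest bit of “index”. The function names, the byte-addressed layout, and the 64-byte block size are hypothetical.

```python
def log2_exact(value):
    """Return log2(value) for a positive power of two, else raise."""
    if value <= 0 or value & (value - 1):
        raise ValueError(f"{value} is not a positive power of two")
    return value.bit_length() - 1


def num_data_banks(sbw_bits, lbw_bits):
    """Assumed relation: N = 2 * SBW / LBW (inferred, not quoted)."""
    return 2 * sbw_bits // lbw_bits


def data_select(req_addr, lbw_bits, n_banks):
    """DS: lowest log2(N) bits of the address above the LBW/8 byte offset."""
    shift = log2_exact(lbw_bits // 8)
    log2_exact(n_banks)  # validates that N is a power of two
    return (req_addr >> shift) & (n_banks - 1)


def tag_select(req_addr, block_bytes):
    """TS: lowest bit of the set index (index starts above the block offset)."""
    return (req_addr >> log2_exact(block_bytes)) & 1


N = num_data_banks(sbw_bits=64, lbw_bits=32)                 # 4 banks
banks = [data_select(a, 32, N) for a in range(0, 0x14, 4)]   # sequential fetches
```

With these assumptions, sequential 4-byte fetches walk through D0, D1, D2, D3 and wrap back to D0, which is the property the inversed clock technique relies on.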

    2.2. Inversed Clock Technique

    Conventional ICache can receive a new request and output an instruction per CPU cycle. However, in D2MB-ICache, the memory banks operate at the divide-by-2 CPU frequency and take two CPU clock cycles to read data. To make D2MB-ICache send an instruction to the pipeline per CPU cycle, we adopt an inversed clock technique. In this work, one part of the memory banks (D0, D2, ···, D_{N-2}, and T0) operates at clk1, which is obtained by dividing the CPU clock, clk, by two. The other part of the memory banks operates at an inversed clock, clk2, which is offset by 180° from clk1. In addition, the instructions are distributed to adjacent data memory banks in order. Accordingly, as shown in Fig. 4, adjacent memory banks can sequentially transfer data to the output port of D2MB-ICache per CPU cycle when the request addresses are sequential.
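The effect of the inversed clocks on sequential fetching can be illustrated with a small model. It is only a sketch: it assumes byte addresses, LBW = 32, N = 4, and that bank selection uses the address bits just above the LBW/8 byte offset; the function names are hypothetical.

```python
def bank_clock(bank):
    """Even-numbered data banks use clk1, odd-numbered banks use clk2."""
    return "clk1" if bank % 2 == 0 else "clk2"


def sequential_schedule(start_addr, count, lbw_bits=32, n_banks=4):
    """List (cpu_cycle, address, bank, clock) for sequential fetches."""
    step = lbw_bits // 8                 # address increment per fetch
    shift = step.bit_length() - 1        # log2(LBW/8), step is a power of two
    rows = []
    for cycle in range(count):
        addr = start_addr + cycle * step
        bank = (addr >> shift) & (n_banks - 1)
        rows.append((cycle, addr, bank, bank_clock(bank)))
    return rows


schedule = sequential_schedule(0x0, 6)
clocks = [row[3] for row in schedule]
```

Consecutive fetches alternate between clk1 banks and clk2 banks, so each bank sees a new request only every second CPU cycle while the cache as a whole can still deliver one instruction per CPU cycle.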

    Fig. 4. Inversed clock for D2MB-ICache’s memory banks.

    2.3. JR Module

    When non-sequential access occurs, the same divide-by-2 data (or tag) memory bank may be accessed in consecutive CPU cycles, which results in missed requests. Therefore, we designed a JR module in the proposed ICache. Firstly, the JR module judges the relationship between the memory banks accessed in S0 and S1 by comparing the TS and DS bits in Req_addr_S0 and Req_addr_S1. Then, to avoid missing non-sequential access, the JR module outputs S1_replay or S2_replay and determines how many times a non-sequential request address is accessed again. When S2_replay is set to 1, D2MB-ICache will access Req_addr_S2 twice. When S1_replay is set to 1, D2MB-ICache will access Req_addr_S1 one more time. Besides, when the JR module controls D2MB-ICache to repeatedly access an address, the access request in S1 is incorrect. Therefore, the JR module sends the signal S1_kill to indicate that the access operation in S1 is invalid. These three control signals ensure that each valid non-sequential access can be carried out in the memory banks. The access operations controlled by the JR module will be elaborated in subsection 3.2.

    3. Access Operations in D2MB-ICache

    3.1. Write Operations

    In conventional ICache, when a cache miss happens, the write operation is performed every CPU cycle and is repeated BS/SBW times. In D2MB-ICache, a divide-by-2 memory bank (D2MB) takes two CPU cycles to complete a write operation. Therefore, in the two CPU cycles, there are two write requests, and (2×SBW)-bit data should be written into the memory banks. To avoid missing the write requests, the (2×SBW)-bit data are separated into three parts and written into the data memory banks in four CPU cycles. The progress of write operations in data memory banks is illustrated in Fig. 5. The signals data_bank_wmode_0, data_bank_wmode_1, and data_bank_wmode_2 are used to enable the write modes of D_P (P=0, 2, ···, (N/2)-2), D_M and D_{2M+1} (M=1, 3, ···, (N/2)-1), and D_{2P+2}, respectively. When there is a write request, the SBW-bit input data (Din) from the lower level memory are split into N/2 LBW-bit data and written into the corresponding data memory banks. In D2MB-ICache, the write operations last for (BS/SBW)+2 CPU cycles. For instance, when SBW=64 and LBW=32, N is equal to 4 and the 64-bit data_input is split into two 32-bit instructions. On the first rising edge of clk (the first rising edge of clk1), the low 32 bits of Din0 are written into D0. On the second rising edge of clk (the first rising edge of clk2), the high 32 bits of Din0 and the high 32 bits of Din1 are written into D1 and D3, respectively. On the third rising edge of clk (the second rising edge of clk1), the low 32 bits of Din1 are written into D2.
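The write latency and the SBW = 64, LBW = 32 example above can be captured in a short sketch. It assumes BS and SBW are expressed in the same unit (bits here) and that clk1 rises on odd-numbered clk edges and clk2 on even-numbered ones, as the example describes. The 512-bit block size, the function names, and the schedule encoding are hypothetical.

```python
def write_cycles(bs_bits, sbw_bits, d2mb):
    """CPU cycles to refill one block: BS/SBW, plus 2 for D2MB-ICache."""
    beats = bs_bits // sbw_bits
    return beats + 2 if d2mb else beats


def clock_for_edge(clk_edge):
    """clk1 rises on clk edges 1, 3, 5, ...; clk2 on edges 2, 4, 6, ..."""
    return "clk1" if clk_edge % 2 == 1 else "clk2"


# (clk rising edge, data slice, destination bank), copied from the example.
EXAMPLE_SCHEDULE = [
    (1, "Din0[31:0]", 0),
    (2, "Din0[63:32]", 1),
    (2, "Din1[63:32]", 3),
    (3, "Din1[31:0]", 2),
]


def schedule_matches_bank_clocks(schedule):
    """Check that every write lands on a rising edge of the bank's own clock."""
    for clk_edge, _slice, bank in schedule:
        bank_clk = "clk1" if bank % 2 == 0 else "clk2"
        if clock_for_edge(clk_edge) != bank_clk:
            return False
    return True


conventional = write_cycles(512, 64, d2mb=False)   # 8 cycles for an assumed 512-bit block
proposed = write_cycles(512, 64, d2mb=True)        # 10 cycles
consistent = schedule_matches_bank_clocks(EXAMPLE_SCHEDULE)
```

The check confirms that, in the quoted example, even banks are written only on clk1 edges and odd banks only on clk2 edges.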

    Fig. 5. Timing of write operations in data memory banks.

    Although this mechanism ensures that the correct data are written into the data memory banks, it may increase the write latency. Algorithm 1 in Table 2 shows that there are three cases for the read request after the write operation. When DS[0]=1'h1, the read operation is delayed by one CPU cycle, and the required instruction is taken from D_M or D_{2M+1} (rows 2 to 4 in Algorithm 1). When DS[1:0]=2'h2, the read operation is delayed by two CPU cycles, and the required instruction is taken from D_{2P+2} (rows 5 to 7). There is no extra latency when the required instruction is stored in D_P (rows 8 to 9). Usually, the first read access to D0 after write operations is performed when the sequentially fetched instructions miss, and the first read access to other data banks after write operations occurs only when taken branch instructions miss. Because ICache misses occur almost exclusively during sequential access, the write operations add only a little latency in D2MB-ICache. As for the tag memory banks, the corresponding tag value is written into the same address in BS/SBW CPU clock cycles. Therefore, the write operations in the tag memory banks incur no extra latency.
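The three cases of Algorithm 1 described above reduce to a check on the low bits of DS. The sketch below encodes only what the paragraph states; the function name is hypothetical, and the listing in Table 2 may contain details that are not reproduced here.

```python
def read_delay_after_write(ds):
    """Extra CPU cycles before the first read after a refill, given DS."""
    if ds & 0b1 == 0b1:      # DS[0] == 1'h1: odd-numbered bank
        return 1
    if ds & 0b11 == 0b10:    # DS[1:0] == 2'h2
        return 2
    return 0                 # DS[1:0] == 2'h0: no extra latency


delays = [read_delay_after_write(ds) for ds in range(4)]   # N = 4 case
```

For N = 4 this gives no delay for D0, one cycle for D1 and D3, and two cycles for D2, matching the order in which the banks are written in the example of this subsection.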

    Table 2: Algorithm 1

    3.2. Read Operations

    3.2.1. Access to Data Memory Banks

    Depending on how the request address changes, the current instruction may be stored in 1) a data memory bank triggered by the inverse of the clock of the data memory bank accessed in S1, 2) another memory bank with the same clock as the last memory bank accessed, or 3) the same memory bank in which the last instruction is stored. These three changes are defined as Case1, Case2, and Case3, respectively. The changes of the request address during sequential fetching belong to Case1. In Case2 and Case3, the instruction in S1 has to be a jump instruction. Note that the proposed architecture cannot be applied when N=1, and there is no Case2 when N=2.
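The three cases can be expressed as a comparison of the DS values in S0 and S1. This is a sketch under the assumption that even-numbered banks share one clock and odd-numbered banks share the inverted clock, as in subsection 2.2; the function name is hypothetical.

```python
def classify_data_case(ds_s0, ds_s1, n_banks):
    """Classify the address change between S1 (previous) and S0 (current)."""
    if n_banks < 2:
        raise ValueError("the architecture requires at least two data banks")
    if ds_s0 == ds_s1:
        return "Case3"       # same bank as the last access
    if ds_s0 % 2 == ds_s1 % 2:
        return "Case2"       # different bank on the same clock
    return "Case1"           # bank on the inverted clock


sequential = classify_data_case(ds_s0=1, ds_s1=0, n_banks=4)    # Case1
same_clock = classify_data_case(ds_s0=2, ds_s1=0, n_banks=4)    # Case2
same_bank = classify_data_case(ds_s0=0, ds_s1=0, n_banks=4)     # Case3
```

With N = 2 there is only one bank per clock, so two different banks always sit on opposite clocks and Case2 cannot occur.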

    In the pipeline, some instructions may take several cycles to execute, so a request address may be held for several cycles. Therefore, a parity counter is included in the JR module to count the consecutive accesses to a request address. Fig. 6 depicts the control flow for the read access to the data memory banks. The parity counter outputs a signal, repeat_data_even, to indicate that Req_addr_S1 has been continuously accessed an even number of times. According to the state of repeat_data_even, two modes are defined: the even mode (repeat_data_even=1) and the odd mode (repeat_data_even=0). When there is a read request to D2MB-ICache, the JR module first checks the parity counter. In the odd mode, when the change of Req_addr is Case1, the read operation is normal and the proposed ICache can read the required instruction from the data memory bank every CPU cycle (Operation1, rows 1 to 3 in Definition 1 shown in Table 3); when the change of Req_addr is Case2, the JR module sends S1_replay and S1_kill to D2MB-ICache in the next CPU cycle, which maintains Req_addr_S1 for one more CPU cycle and invalidates the request in S0 (Operation2, rows 4 to 7); when S1 is invalid, the read operation in Case3 is Operation2; when S1 is valid in Case3, the JR module sends S2_replay to D2MB-ICache to fetch the instruction in S2 for two more CPU cycles, and then it invalidates the requests in S0 and S1 (Operation3, rows 8 to 11). In the even mode, the read operation in Case1 is Operation2, and when the change of Req_addr is Case2 or Case3, the read operation is Operation1.
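The decision rules in this paragraph can be written as a small lookup. The sketch below encodes only the mapping stated in the text; the function name and the EXTRA_CYCLES table are hypothetical, with the cycle counts read from "one more CPU cycle" for Operation2 and "two more CPU cycles" for Operation3.

```python
# Extra CPU cycles per operation, as read from the description (assumption).
EXTRA_CYCLES = {"Operation1": 0, "Operation2": 1, "Operation3": 2}


def select_operation(case, repeat_data_even, s1_valid=True):
    """Pick the read operation for a data-bank access."""
    if case not in ("Case1", "Case2", "Case3"):
        raise ValueError(f"unknown case: {case}")
    if repeat_data_even:                       # even mode
        return "Operation2" if case == "Case1" else "Operation1"
    if case == "Case1":                        # odd mode from here on
        return "Operation1"
    if case == "Case2":
        return "Operation2"
    return "Operation3" if s1_valid else "Operation2"   # Case3


odd_sequential = select_operation("Case1", repeat_data_even=False)
odd_same_bank = select_operation("Case3", repeat_data_even=False)
even_sequential = select_operation("Case1", repeat_data_even=True)
```

The even mode mirrors the odd mode because a request address held for an even number of CPU cycles shifts the alignment between the request and the bank clocks by one CPU cycle.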

    Table 3: Definition 1

    3.2.2. Access to Tag Memory Banks

    The changes of Req_addr for tag memory banks are classified into two cases: the tag memory bank accessed in S0 is different from that accessed in S1 (Case4), or it is the same (Case5). As shown in Fig. 7, the read operations of the tag memory banks in D2MB-ICache are similar to those of the data memory banks. The difference is that the JR module infers the mode in which the tag memory banks work from repeat_tag_even. Besides, the JR module deals with Case4 and Case5 in the same way as Case1 and Case3, respectively.
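A sketch of the tag-bank handling follows, assuming that Case4 and Case5 reuse the data-bank rules for Case1 and Case3 exactly as stated, with repeat_tag_even taking the role of repeat_data_even. The function names are hypothetical, and Fig. 7 may contain details not captured here.

```python
def classify_tag_case(ts_s0, ts_s1):
    """Case4: different tag bank than in S1; Case5: same tag bank."""
    return "Case4" if ts_s0 != ts_s1 else "Case5"


def select_tag_operation(case, repeat_tag_even, s1_valid=True):
    """Apply the Case1 rule to Case4 and the Case3 rule to Case5."""
    if case not in ("Case4", "Case5"):
        raise ValueError(f"unknown case: {case}")
    if repeat_tag_even:                        # even mode
        return "Operation2" if case == "Case4" else "Operation1"
    if case == "Case4":                        # odd mode, other tag bank
        return "Operation1"
    return "Operation3" if s1_valid else "Operation2"   # odd mode, same bank


other_bank = select_tag_operation(classify_tag_case(1, 0), repeat_tag_even=False)
same_bank = select_tag_operation(classify_tag_case(0, 0), repeat_tag_even=False)
```

Because there are only two tag memory banks, there is no tag-bank counterpart of Case2.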

    Fig. 6. Control flow for read operations in data memory banks.

    Fig. 7. Control flow for read operations in tag memory banks.

    3.2.3. Timing of Read Operations

    The design of the clocks allows the memory banks to catch the changes of request addresses in time in Case1 and Case4 in the odd mode, so no extra latency is added to the read operations. Fig. 8 illustrates the timing of the read operations in the odd mode when Case1 occurs. D2MB-ICache can catch every request and read instructions from the data memory banks in every CPU cycle.

    The signals S1_replay and S2_replay ensure that D2MB-ICache does not miss any jump of the request address and provides the pipeline with the correct instructions. Fig. 9 shows the timing of read operations in the odd mode when Case2 and Case3 occur.

    Fig. 8. Timing of read operations in odd-mode Case1.

    Fig. 9. Timing of read operations with S1_replay and S2_replay.

    We assume that Req_addr0 and Req_addr1 are both mapped to D0 (Case3), and Req_addr2 and Req_addr3 are mapped to D1 and D3 (Case2), respectively. On the second rising edge of clk, CPU requests to access Req_addr1. It means that D0 should be accessed on the falling edge of clk1, which may not meet the timing requirements and may cause a wrong output. To solve this problem, the JR module sets S1_kill to 1 and invalidates the second access request. The dotted line with an arrow in Fig. 9 indicates that the current access request is invalid. Besides, the JR module uses S2_replay to control D2MB-ICache to access Req_addr0 and Req_addr1 again on the second and third rising edges of clk1, respectively. On the 7th rising edge of clk, CPU requests to access Req_addr3, which causes D3 to be accessed on the falling edge of clk2. To ensure the correct output of the proposed ICache, the JR module sends S1_kill and S1_replay, and delays the access to Req_addr3. As a result, S1_replay and S2_replay increase the access latency. However, because the ratio of taken branch instructions is low [23], the extra execution cycles caused by D2MB-ICache have a limited impact on the overall performance of CPU.

    4. Experiment

    4.1. Experimental Framework

    The proposed method was implemented on Rocket-Chip [24], a single-issue in-order reduced instruction set computer five (RISC-V) processor. To evaluate the performance of D2MB-ICache, the RISC-V processor was simulated with the Synopsys Verilog compiler simulator (VCS), and ten open-source benchmarks from GitHub were executed. The processor and its memory hierarchy configuration are described in Table 4. The executed benchmarks are described in Table 5.

    Table 4: Configuration of the simulated processor

    Table 5: Ten workloads used for evaluation

    4.2. Frequencies and Energy Consumption

    The memory banks were generated by the SMIC 55-nm single-port memory compiler. We performed the static timing and power analysis of the different ICaches with the SMIC 55-nm standard cell library using Synopsys Design Compiler. Fig. 10 compares the maximum frequencies of conventional 4-way set-associative ICache and D2MB-ICache with different capacities. With the same size, the proposed D2MB-ICache exhibits higher maximum frequencies than conventional ICache. For instance, the 128-kB D2MB-ICache can work at 794 MHz, which is 127 MHz higher than the 128-kB conventional ICache. The average maximum operating frequencies of D2MB-ICaches with 1× and 2× capacities increase by 14.6% and 6.8% compared with conventional ICache, respectively. Moreover, the frequency of quadruple-size D2MB-ICache is almost as high as that of conventional ICache without capacity expansion.
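As a quick arithmetic check of the 128-kB example, the conventional frequency, the relative gain, and the resulting memory bank frequency can be derived from the two quoted numbers. Only the 794 MHz and 127 MHz values come from the text; the helper name is hypothetical and the derived values are not figures reported by the authors.

```python
def relative_gain_percent(new_mhz, base_mhz):
    """Relative increase of new_mhz over base_mhz, in percent."""
    return (new_mhz - base_mhz) / base_mhz * 100.0


d2mb_mhz = 794.0                      # 128-kB D2MB-ICache, from the text
conventional_mhz = d2mb_mhz - 127.0   # 667 MHz for the conventional ICache
bank_mhz = d2mb_mhz / 2               # banks run at the divide-by-2 frequency
gain = relative_gain_percent(d2mb_mhz, conventional_mhz)   # about 19%
```

This single 128-kB data point lies above the 14.6% average for 1× capacity, which is consistent with the gap widening as the capacity grows.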

    Based on Fig. 10, we evaluated the energy consumption of the different ICaches operating at their maximum possible frequencies. Fig. 11 presents the total energy consumption normalized to that of the 16-kB conventional ICache. The proposed ICaches with capacities of 16 kB, 32 kB, 64 kB, and 128 kB consume 0.5%, 16.1%, 24.3%, and 24.8% less energy than the conventional ones, respectively. This is because, although the frequency of the proposed ICache has increased, the frequency of the divide-by-2 memory banks that make up D2MB-ICache has decreased. Consequently, the proposed ICache consumes less energy than conventional ICache.

    Fig. 10. Maximum operating frequencies in various ICaches.

    Fig. 11. Normalized energy consumption in various ICaches.

    Fig. 12. Normalized execution cycles of processor applications with conventional ICache and D2MB-ICache.

    4.3. Execution Cycles

    The variation of execution cycles with different cache sizes and architectures was also analyzed. Processors with three ICache capacities (32 kB, 64 kB, and 128 kB) were simulated by using the ten benchmarks presented in Table 5. Due to the existence of branch instructions in the benchmarks, D2MB-ICache needs to fetch certain instructions repeatedly, resulting in more execution cycles of the processor, as illustrated in Fig. 12. Moreover, the number of accesses to conventional ICache and the increased execution cycles caused by D2MB-ICache are counted for each benchmark in Fig. 13. The increase in execution cycles is not closely related to the number of non-sequential accesses. D2MB-ICache only repeatedly accesses certain branch target addresses, thereby minimizing the cost in execution cycles caused by address jumps. In short, the increase in execution cycles is mainly determined by the actual changes in the request addresses.

    Fig. 13. Number of accesses (Y axis on the left) to conventional ICache and increased execution (EX) cycles (Y axis on the right) caused by D2MB-ICache.

    4.4. Execution Time

    To evaluate the performance of D2MB-ICache, the processors with different ICache configurations executed the ten benchmarks at their maximum frequencies (shown in Fig. 14). With the same cache capacity, the processor with D2MB-ICache takes less execution time than that with conventional ICache, and the execution time difference grows with increasing capacity. Apart from qsort and dhrystone, the execution time of double-size D2MB-ICache is less than that of conventional ICache without capacity expansion. Compared with conventional ICache, D2MB-ICaches with 1× and 2× capacities exhibit an average execution time reduction of 10.3% and 3.8%, respectively, while the execution time of quadruple-size D2MB-ICache is approximately equal to that of conventional ICache.
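The trade-off behind these results is that execution time equals execution cycles divided by clock frequency, so replay cycles pay off only while the frequency gain outweighs them. The sketch below illustrates that relation. The function name is hypothetical, and the ratios used in the example are illustrative inputs rather than per-benchmark measurements; the averages reported in the paper need not compose this way.

```python
def time_reduction_percent(cycle_ratio, freq_ratio):
    """Execution time reduction versus a baseline, in percent.

    cycle_ratio: cycles(new) / cycles(baseline), at least 1 when replays add cycles.
    freq_ratio:  frequency(new) / frequency(baseline).
    """
    return (1.0 - cycle_ratio / freq_ratio) * 100.0


# Upper bound when no replay cycles are added and the clock is 14.6% faster.
no_replay = time_reduction_percent(cycle_ratio=1.0, freq_ratio=1.146)

# Break-even: the extra cycles exactly cancel the frequency gain.
break_even = time_reduction_percent(cycle_ratio=1.146, freq_ratio=1.146)
```

Any benchmark whose cycle ratio stays below the frequency ratio finishes sooner on D2MB-ICache, which is why a low taken branch ratio matters for the proposed scheme.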

    Fig. 14. Normalized performance of the processors with different ICaches.

    5. Related Work and Comparison

    Atoofian [13] exploited the property of value similarity in the L1 data cache and the L2 caches to compress data in caches. Compared with conventional caches, the compressed caches improve performance by 10.1% on average, which is close to the improvement achieved by double-sized caches. Because the compressed caches leave more cache banks idle, power gating can be applied to reduce the leakage power. As a result, these compressed caches decrease the total energy consumption by 8%. Rea and Atoofian [14] combined data prefetching and compression techniques to expand the logical capacity and mitigate the decompression latency in the compressed caches. They found that the compressed caches with a 4-kB last outcome (LO) and stride (S) prefetcher exhibit a 1.7% speedup over the compression-only caches. However, the performance improvement comes at the cost of power and area.

    Chiu et al. [25] proposed cache resiliency techniques, line recycling (LR) and bit bypass (BB-S), to optimize the cache architecture. Instead of simply disabling the working bits in faulty cache lines, LR reuses them and decreases the capacity loss by 33%. Furthermore, LR saves 43% of the energy consumption with a 0.77% L2 area cost in 28 nm. BB-S uses flip-flops to minimize the overhead of error entries, which provides error protection for the tag arrays.

    Spin-transfer torque random access memory (STT-RAM) has been proposed as a promising replacement for SRAM to reduce the leakage power consumption and the area overhead. Based on STT-RAM, Li et al. [18] and Kong [19] proposed novel architectures and effectively reduced energy consumption. However, STT-RAM has poor write latency, which degrades the performance of the whole processor. Cheng et al. [20] proposed a locality-aware method with a hybrid SRAM and STT-RAM configurable architecture. With only 5% of the latency cost, the energy efficiency of the L1 cache has been improved by 15% to 20%. In summary, due to performance limitations, there is still a long way to go before NVMs can replace SRAM in the L1 cache.

    Different from previous related work, our proposed technique enables the memory banks in ICache to operate at the divide-by-2 CPU frequency, which improves the physical capacity, performance, and energy efficiency of ICache. Compared with double-sized conventional caches, D2MB-ICaches increase performance by 3.8% on average, while the compressed caches in [14] only achieve an increase of 1.7%. Moreover, the power consumption in this work is also better optimized than that of [18] and [19].

    6. Conclusion

    This paper demonstrates a large-capacity and high-speed ICache architecture consisting of memory banks operating at the divide-by-2 frequency. The proposed D2MB-ICache can provide instructions to the pipeline at the same frequency as CPU by 1) dividing the data and tag memory banks according to a partition mechanism and 2) inverting the memory bank clock and allocating instructions to the adjacent memory banks in order. As for non-sequential access, a JR module was designed to repeat certain branch instructions and ensure that D2MB-ICache does not miss any jump requests. Compared with conventional ICache, D2MB-ICaches with 1× and 2× capacities increase the average maximum frequency by 14.6% and 6.8%, and decrease the average execution time by 10.3% and 3.8%, respectively. The performance of quadruple-size D2MB-ICache is close to that of conventional ICache without capacity expansion. Moreover, D2MB-ICaches with different capacities (16 kB, 32 kB, 64 kB, and 128 kB) reduce the energy consumption by 0.5%, 16.1%, 24.3%, and 24.8%, respectively. The proposed scheme shows a significant advantage in performance when the programs have a low ratio of taken branch instructions. Furthermore, since the energy consumption of D2MB-ICache grows significantly more slowly with capacity than that of conventional ICache, the proposed architecture has an advantage in large-capacity ICaches.

    Disclosures

    The authors declare no conflicts of interest.
