
    Characterizing the Relative Importance Assigned to Physical Variables by Climate Scientists when Assessing Atmospheric Climate Model Fidelity

    Advances in Atmospheric Sciences, 2018, No. 9

    Susannah M. BURROWS, Aritra DASGUPTA, Sarah REEHL, Lisa BRAMER, Po-Lun MA, Philip J. RASCH, and Yun QIAN

    Pacific Northwest National Laboratory, Richland, Washington 99354, USA

    1. Introduction

    A critical aspect of any climate modeling research is an evaluation of the realism, or fidelity, of the model’s simulated climate through a careful comparison with observational data. For the purposes of this discussion, we define a climate model’s “fidelity” broadly as the agreement of the simulated climate with the observed historical and present-day climate state, typically using a combination of satellite and ground-based observations, field campaign measurements, and reanalysis data products as primary sources of observational data. At climate modeling centers around the world, the development of a new model version is always followed by a calibration (“tuning”) effort aimed at selecting values for model parameters that are physically justifiable and lead to a credible simulation of climate (Hourdin et al., 2017). Model tuning involves the completion of a large number of simulations with variations in parameters, input files, and other features of the model. Each simulation is painstakingly evaluated, typically by examining a set of priority metrics, accompanied by manual inspection of a variety of plots and visualizations of various modeled fields, and detailed comparisons to determine which model configuration produces a credible realization of the climate. Tuning one coupled climate model requires thousands of hours of effort by skilled experts. Experts must exercise judgment, based on years of training, experience, and broad and deep understanding of the model, the physical climate system, and observational constraints, in determining which trade-offs are defensible when different optimization goals conflict.

    Comparisons of model fidelity across multiple model simulations are also carried out in multi-model intercomparison projects (e.g., Gleckler et al., 2008; Reichler and Kim, 2008), and in perturbed parameter ensemble experiments for the purpose of quantifying model uncertainty or sensitivities (Yang et al., 2013; Qian et al., 2015, 2016). Such studies aim to understand what factors lead to inter-model diversity and drive model sensitivities, and to identify potential improvements. Additionally, if an adequate single metric of overall climate model fidelity could be developed, it could be applied to construct weighted averages of climate simulation ensembles (Min and Hense, 2006; Suckling and Smith, 2013), and used in automatic parameter optimization algorithms (Zhang et al., 2015).

    Early efforts to characterize multi-variable climate model fidelity calculated an index of fidelity by computing a normalized root-mean-square error or similar metric for each of a selected set of model variables, and then averaging these metrics over all variables (Gleckler et al., 2008; Reichler and Kim, 2008). More nuanced objective methods have been proposed to account for the inherent variability in each field (Braverman et al., 2011), and for spatial and temporal dependencies between variables (Nosedal-Sanchez et al., 2016).
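    As a minimal sketch of this family of metrics (not the exact formulation used in the cited studies), a normalized RMSE can be computed per variable and then averaged with equal weights; the tiny example fields below are invented purely for illustration:

```python
import numpy as np

def normalized_rmse(model, obs, area_weights=None):
    """Area-weighted RMSE of a model field against observations,
    normalized by the weighted standard deviation of the observations.
    An illustrative sketch in the spirit of Gleckler-style metrics."""
    model, obs = np.asarray(model, float), np.asarray(obs, float)
    if area_weights is None:
        area_weights = np.ones_like(obs)
    w = area_weights / area_weights.sum()
    rmse = np.sqrt(np.sum(w * (model - obs) ** 2))
    obs_sd = np.sqrt(np.sum(w * (obs - np.sum(w * obs)) ** 2))
    return rmse / obs_sd

# An equal-weight average over variables, as in the early index approaches
# (hypothetical data standing in for gridded model and observed fields):
fields = [
    (np.array([1.0, 2.0, 3.0]), np.array([1.1, 2.0, 2.8])),  # e.g. T2m
    (np.array([0.5, 0.7]), np.array([0.4, 0.8])),            # e.g. precip
]
index = np.mean([normalized_rmse(m, o) for m, o in fields])
```

    A perfect simulation gives an index of 0; larger values indicate larger errors relative to observed variability.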

    These objective methods characterize how closely models resemble observations of specific variables with an increasing degree of sophistication. Nevertheless, in all such approaches, expert judgment is exercised in the selection of which variables to include. In addition, in most previous studies, an implicit decision was made to treat all variables as being of equal physical importance. By contrast, when experts evaluate model fidelity, their decision-making implicitly incorporates their understanding of the physical importance of specific variables to the science questions they are interested in, and more emphasis is placed on the most physically relevant variables. Recent studies have emphasized that the selection of assessed variables should reflect physical understanding of the system under consideration (Knutti et al., 2017), and that different research teams may select different optimization criteria when weighting model ensemble members, depending on their goals (Herger et al., 2017).

    A potential path forward is to construct a fidelity index I that combines multiple metrics m_i characterizing different aspects of model fidelity, weighted by their relative importance w_i:

        I = Σ_i w_i m_i .

    However, since the relative “importance” of different optimization goals is inherently subjective, any such index, including one in which all w_i are equal, will be susceptible to the criticism that the chosen weights are arbitrary.
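    In code, such an index is simply a weighted mean of per-variable metrics; the metric values and weights in this sketch are hypothetical, since the paper does not prescribe a specific form beyond the weighted combination:

```python
import numpy as np

def fidelity_index(metrics, weights):
    """I = sum_i w_i * m_i, with the weights normalized so that the
    index is a weighted mean of the per-variable metrics m_i."""
    m = np.asarray(metrics, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(np.dot(w, m))

# Hypothetical skill scores for three variables, combined with equal
# weights (the implicit choice in many earlier studies) and with
# illustrative expert-derived importance weights:
m = [0.8, 0.6, 0.9]
equal = fidelity_index(m, [1, 1, 1])
weighted = fidelity_index(m, [3, 1, 2])
```

    The two calls generally give different rankings of candidate models, which is exactly why the choice of weights matters.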

    Since expert judgment cannot be fully eliminated from the model evaluation process, we propose that it would be valuable to better understand and quantify the relative importance climate modelers assign to different aspects of model fidelity when making decisions about trade-offs. In addition, we believe it is important to quantify the degree to which consensus exists about the importance of such variables. In the longer term, we envision that this information can be used to develop metrics that quantify both the mean and the variability of the community’s judgments about climate model fidelity.

    This paper reports on our first step towards this long-term goal: the establishment of a baseline understanding of the level of importance that experts explicitly state they assign to different variables when evaluating the mean climate state of the atmosphere of a climate model. To this end, we conducted a large international survey of climate model developers and users, and asked them to indicate their view of the relative importance of a subset of variables used in assessing model fidelity, in the context of particular scientific goals. The specific aims of this study are to: (1) quantify the extent of consensus among climate modelers on the relative importance of different variables in evaluating climate models; (2) document whether modelers adjust their importance weights depending on the scientific purpose for which a model is being evaluated; (3) determine whether either importance rankings or degree of consensus vary as a function of an individual’s experience or domain of expertise; and (4) provide baseline information for a planned follow-up study, a mock model evaluation exercise. In the follow-up study, described in more detail in section 4, we will investigate whether experts’ assessments of models, on the basis of plots and metrics describing model–observation comparisons, are consistent with the relative importance that these experts previously assigned to individual variables for the assessment of model fidelity, with respect to specific science goals.

    We describe the present study in the following sections. Section 2 describes the design of the survey, the recruitment of participants, and the methods used in analyzing survey responses. Section 3 describes the results of the survey, including the distribution of importance rankings, the degree of consensus, the dependence of responses on the specific science questions and on respondents’ level of experience, and perceived barriers to systematic quantification of climate model fidelity. Section 4 discusses a potential approach to synthesizing expert assessments of model fidelity and objective methods for fidelity assessment, by systematically measuring and explicitly accounting for the relative importance experts assign to different aspects of fidelity. Finally, section 5 summarizes the key points and conclusions from this study.

    2. Survey design and methods

    2.1. Survey aims, design and scope

    We conducted a large international survey to document and understand the expert judgments of the climate modeling community on the relative importance of different model variables in the evaluation of simulation fidelity.

    To keep the scope of this study focused, we only considered the evaluation of the annual mean climatology of an atmosphere-only model simulation with prescribed SSTs. In addition, participants were asked to assume that their evaluation would be carried out only on the basis of scalar metrics (e.g., RMSE, correlation) characterizing the agreement of the respective model field with observations.

    Transient features of climate were intentionally excluded from this study; they are nevertheless of critical importance in model evaluation and should be explored in future work. Similarly, coupled climate models have more complex tuning criteria that are not considered here.

    We chose to limit the number of variables and criteria under consideration in order to encourage broader participation, and in anticipation of a planned follow-up study (described in more detail in section 4). Briefly, the follow-up study will invite experts to compare and evaluate climate model outputs, and will aim to infer the importance that experts implicitly assign to different aspects of model fidelity in conducting this assessment. To the best of our knowledge, this would be the first attempt to experimentally characterize expert evaluations of climate model fidelity, and so we aim to initially test the approach using a small number of key variables, which will allow for a more controlled study. The relative importance ratings and other input from experts reported in this study will both inform the design of the follow-up study and provide a priori values for Bayesian inference of the weights w_i.

    The importance of a particular variable in model evaluation will depend on the purpose for which the model will be used. To better constrain the responses, as well as to explore how expert rankings of different model variables might change depending on the scientific objectives, we asked participants to rate the importance of different variables with respect to several different “Science Drivers”. A list of the six Science Drivers used in this survey is shown in Table 1. For each Science Driver, participants were presented with a preselected list of variables thought to be relevant to that topic, and asked to rate the importance of each variable on a seven-point Likert scale from “Not at all Important” to “Extremely Important”. Participants were also invited to provide written feedback identifying any “very important” or “extremely important” variables that they felt had been overlooked; many took the opportunity to provide these comments, which are summarized in Tables S1–S3 (see Electronic Supplementary Material). This feedback will be used to improve the survey design in the follow-up study.

    Table 1. Science Driver (SD) questions posed in this survey.

    2.2. Survey recruitment, participation, and data screening

    The survey was distributed via several professional mailing lists targeting communities of climate scientists, especially model developers and users, and by directly soliciting input from colleagues through the professional networks of the authors of this paper. Due to privacy restrictions, we are unable to report the identities or geographic locations of survey respondents, but we are confident that they are representative of the climate modeling community. The survey was open from 18 January 2017 to 25 April 2017. Participants who had not completed all items on the first Science Driver (N = 12), and participants who rated themselves as “not at all experienced” in evaluating model fidelity (N = 7), were excluded from analysis. Of the remaining 96 participants, 81 completed all six Science Drivers.

    Our survey respondents were a highly experienced group, with the vast majority of participants rating themselves as either “very familiar” (40.6%) or “extremely familiar” (40.6%) with climate modeling. In addition, a large fraction of our participants had worked in climate modeling for many years: the majority (62) reported at least 10 years’ experience, and a substantial number (31) reported at least 20 years’ experience with climate modeling. When asked to rate their experience in “evaluating the fidelity of the atmospheric component of global climate model simulations,” 37.5% rated themselves as “very experienced,” and 20.8% as “moderately experienced” in “tuning/calibrating the atmospheric component of global climate model simulations”. An overview of the characteristics of the survey participants is shown in Fig. 1.

    2.3. Formal consensus measure: Coefficient of Agreement (A)

    To quantify the degree of consensus among our participants, we employ a formal measure of consensus called the coefficient of agreement A (Riffenburgh and Johnstone, 2009), which ranges from values near 0 (no agreement; random responses) to a maximum possible value of 1 (complete consensus). Calculated values of A for the two experience groups, together with p-values testing whether they differ significantly from each other, are tabulated for all Science Drivers and variables in Supplementary Tables S4–S6.

    The coefficient of agreement is calculated from the observed disagreement d_obs and the expected disagreement under the null hypothesis of random responses, d_exp. Let r_max denote the number of possible options (7 in the Likert scale used here); let r = 1, ..., r_max denote the possible responses (r = 7 is “Extremely important”, r = 6 is “Very important”, and so on); let n_r denote the number of respondents choosing the rth option; and let r_med denote the median value of r from all respondents. The observed disagreement is then calculated as

        d_obs = Σ_{r=1}^{r_max} n_r |r_med − r| ,

    where |r_med − r| is the weight for the rth choice. The expected disagreement, for N respondents answering uniformly at random, is calculated as

        d_exp = (N / r_max) Σ_{r=1}^{r_max} |r_med − r| .

    The coefficient of agreement A is then calculated as the complement of the ratio of observed to expected disagreement:

        A = 1 − d_obs / d_exp .

    For randomly distributed responses, d_obs would be close to d_exp and A would be close to zero; for perfect agreement, d_obs = 0 and A = 1.
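    A short sketch of this calculation follows, reflecting our reading of the construction above (the exact formulation in Riffenburgh and Johnstone (2009) may differ in detail):

```python
import numpy as np

def coefficient_of_agreement(responses, r_max=7):
    """A = 1 - d_obs / d_exp, where disagreement is weighted by the
    distance of each Likert option (1..r_max) from the median response."""
    responses = np.asarray(responses)
    options = np.arange(1, r_max + 1)
    n_r = np.array([(responses == r).sum() for r in options])
    weights = np.abs(np.median(responses) - options)
    d_obs = (weights * n_r).sum()                    # observed disagreement
    d_exp = len(responses) * weights.sum() / r_max   # uniform-response null
    return 1.0 - d_obs / d_exp

complete = coefficient_of_agreement([6, 6, 6, 6])          # unanimous group
uniform = coefficient_of_agreement([1, 2, 3, 4, 5, 6, 7])  # uniform spread
```

    A unanimous group yields A = 1, while responses spread uniformly over all seven options yield A = 0.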

    Fig. 1. Characteristics of survey participants.

    Because the value of A is sensitive to the total number of respondents N, A is not comparable across subgroups of participants of different sizes. We therefore performed additional significance testing to determine whether the degree of consensus was the same, or different, between our “high experience” and “low experience” groups, and/or between two Science Drivers.

    We test for statistically significant differences between the coefficients of agreement of two groups of responses, A_1 and A_2, by performing a randomization test with the null hypothesis H_0: A_1 = A_2. To perform this test, we take l = 1, ..., 100 random draws, without replacement, from the two groups of survey responses. For each lth draw, we calculate the difference in the coefficient of agreement for the two groups, d_l = |A_1l − A_2l|. We then calculate the p-value for rejection of the null hypothesis, i.e., the probability that a difference in agreement larger than the observed mean could occur by chance:

        p = (1/100) Σ_{l=1}^{100} I(d_l > d_l,mean) ,

    where d_l,mean is the mean of all d_l and I(·) is the indicator function.
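    The following sketch implements one reading of this test. The subsample size per draw is our assumption (the text does not state it); equal-size subsamples are drawn from each group because A is sensitive to N, and the coefficient-of-agreement helper follows the construction in section 2.3:

```python
import numpy as np

def coefficient_of_agreement(responses, r_max=7):
    # A = 1 - d_obs / d_exp (see section 2.3).
    responses = np.asarray(responses)
    options = np.arange(1, r_max + 1)
    n_r = np.array([(responses == r).sum() for r in options])
    weights = np.abs(np.median(responses) - options)
    d_exp = len(responses) * weights.sum() / r_max
    return 1.0 - (weights * n_r).sum() / d_exp

def agreement_randomization_test(group1, group2, n_draws=100, k=20, seed=0):
    """Randomization test of H0: A1 = A2.  Each draw subsamples k
    responses without replacement from each group, computes
    d_l = |A1_l - A2_l|, and the p-value is the fraction of draws whose
    d_l exceeds the mean of all d_l."""
    rng = np.random.default_rng(seed)
    g1, g2 = np.asarray(group1), np.asarray(group2)
    d = np.array([
        abs(coefficient_of_agreement(rng.choice(g1, k, replace=False))
            - coefficient_of_agreement(rng.choice(g2, k, replace=False)))
        for _ in range(n_draws)
    ])
    return float(np.mean(d > d.mean()))
```

    Both groups must contain at least k responses; a small p-value indicates that the two groups' levels of consensus are unlikely to be equal.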

    3. Survey results and discussion

    Here we report on selected analyses and results from the survey. We focus primarily on: (1) the degree of consensus among experts on the importance of different model variables; (2) how responsive experts’ assessments of variable importance are to the defined scientific objectives; and (3) differences in expert ratings of variable importance between respondents with more climate modeling experience and those with less.

    We also performed similar analyses comparing survey responses from model users and model developers. The responses of these two groups were statistically nearly identical, and so we do not report them in further detail.

    3.1. Importance of different variables to climate model fidelity assessments across six Science Drivers

    In this section, we discuss expert ratings of variable importance for the six Science Drivers. In order to understand whether participants’ responses differed depending on their degree of expertise, we first divided the participants into two experience groups: those who rated themselves as “very experienced” in evaluating model fidelity were placed into the “high experience” group (N = 36); all other participants were placed into the “low experience” group (N = 60).

    We emphasize that our “low experience” group consists largely of working climate scientists over the age of 30 (95%), with a median of 10 years of experience in climate modeling. In other words, our “low experience” group mostly consists not of laypersons, students or trainees, but of early-to-mid-career climate scientists with moderate levels of experience in evaluating and tuning climate models. Our “high experience” group consists largely of mid-to-late-career scientists: the majority are over the age of 50 (53%), with a median of 20.5 years of experience in climate modeling. Researchers on the development of expertise have argued that roughly 10 years of experience are needed for the development and maturation of expertise (Ericsson, 1996); 86% of our “high experience” group members have 10 years or more of climate modeling experience.

    3.1.1. Science Driver 1: How well does the model reproduce the overall features of the Earth’s climate?

    Our first Science Driver asked respondents to assess the importance of different variables to “the overall features of Earth’s climate”. We believe that this statement summarizes the primary aim of most experts when calibrating a climate model. However, experts’ typical practices are likely to be influenced by factors such as the tools and practices used by their mentors and immediate colleagues, their disciplinary background, and their research interests. Such factors could contribute to differences in judgments of what constitutes a “good” model simulation. The aim of this Science Driver is to understand what experts prioritize when the goal is relatively imprecisely defined as optimizing the “overall features” of climate; these responses can then be contrasted with the more specific questions in the following five Science Drivers.

    Figures 2 and 3 show the distribution of responses for each variable in Science Driver 1 for the high and low experience groups. Figure 4 (top) summarizes the mean and standard deviation of importance ratings for all variables in Science Driver 1. Overall, the variables most likely to be identified as “extremely important” were (in ranked order): rain flux (N = 31), 2-m air temperature (N = 28), longwave cloud forcing (N = 22), shortwave cloud forcing (N = 21), and sea level pressure (N = 20). The complete distributions of responses for all Science Drivers by experience group, together with statistical summary variables and significance tests, are shown in Tables S1–S13.

    Fig. 2. Science Driver 1: distributions of importance ratings, ranked by consensus as quantified by the coefficient of agreement A, for variables with high expert consensus about their importance.

    The distributions and degree of consensus are similar between the two groups, with no statistically significant differences for any variable (see Supplementary Tables S4–S6). This suggests that once an initial level of experience is acquired, additional experience may not lead to significant differences in judgments about model fidelity.

    Fig. 3. As in Fig. 2, but for variables with low expert consensus about their importance.

    It is instructive to examine which variables are the exceptions to this general rule; these exceptions hint at where and how greater experience matters most in informing the judgments experts make about model fidelity. The distributions of responses of the high and low experience groups differed for only one item in Science Driver 1, the oceanic surface wind stress (p < 0.01); for this variable, the median response of the high and low experience groups was “very important” and “moderately important,” respectively. We speculate that the high experience group may be more sensitive to this variable due to (1) its critical importance to ocean–atmosphere coupling, and (2) awareness of the relatively high-quality observational constraints available from wind scatterometer data.

    We also investigated the degree of consensus on the importance of different variables. We observe a clearly higher degree of consensus for some variables than for others. Across all participants (high and low experience groups together), there is a comparatively high degree of consensus on the importance of shortwave cloud forcing (A = 0.67), longwave cloud forcing (A = 0.62), and rain flux (A = 0.62). By contrast, there is comparatively little agreement on the importance of oceanic surface wind stress (A = 0.39), due to the discrepancy between experience groups on this item, and on aerosol optical depth (AOD; A = 0.42). The data we collected do not allow us to be certain of the reasoning behind importance ratings, but the lack of consensus on the importance of AOD is perhaps unsurprising in light of the high uncertainty associated with the magnitude of aerosol impacts on climate (Stocker et al., 2013), and recent controversies among climate modelers on the importance of aerosols to climate, or lack thereof (Booth et al., 2012; Stevens, 2013; Seinfeld et al., 2016).

    3.1.2. Science Driver 2: How well does the model reproduce features of the global water cycle?

    Our second Science Driver included a comparatively limited number of variables related to the global water cycle (Fig. 4: middle). These should be considered in combination with Science Driver 6, which addresses the assessment of simulated clouds using a satellite simulator (Fig. 5).

    Fig. 4. Science Drivers 1–3: mean responses for the high and low experience groups, ranked by overall mean response from all participants; the color of the dots indicates the standard deviation of responses.

    While the differences did not pass our criteria for statistical significance, we note a slight tendency for the high experience group to assign higher mean importance ratings to net TOA radiative fluxes and precipitable water amount. We speculate that this might be due to a slightly greater awareness of, and sensitivity to, observational uncertainties among the high experience group, expressed as a higher importance rating for variables with stronger observational constraints from satellite measurements. This interpretation is supported by the comment of one study participant (with 20 years’ experience in climate modeling), who observed that “surface LH [latent heating] and SH [sensible heating] are not well constrained from obs[ervations]. While important, that means they aren’t much use for tuning.”

    3.1.3. Science Driver 3: How well does the model simulate Southern Ocean climate?

    For Southern Ocean climate, surface interactions that affect ocean–atmosphere coupling, including wind stress, latent heat flux (evaporation) and rain flux, together with shortwave cloud forcing, were identified as among the most important variables by our participants (Fig. 4: bottom).

    Fig. 5. Science Drivers 4–5: mean responses for the high and low experience groups, ranked by overall mean response from all participants; the color of the dots indicates the standard deviation of responses.

    The high experience group rated rain flux as more important (median: “very” important) than the low experience group did (median: “moderately” important; probability of difference: p = 0.02).

    It is interesting to compare the responses with Science Driver 1, which included many of the same variables. For instance, for AOD, the low experience group assigned a lower mean importance for overall climate (mean: 4.04; σ: 1.49) than for Southern Ocean climate (mean: 4.32; σ: 1.41); the high experience group assigned a higher mean importance for overall climate (mean: 4.64; σ: 1.16) than for Southern Ocean climate (mean: 4.34; σ: 1.13).

    The reasons for this discrepancy are unclear. One possibility is that the high experience group may be more aware that over the Southern Ocean, AOD provides a poor constraint on cloud condensation nuclei (Stier, 2016), and is affected by substantial observational uncertainties, with estimates varying widely between different satellite products.

    3.1.4. Science Driver 4: How well does the model simulate important features of the water cycle in the Amazon watershed?

    On Science Driver 4, which addresses the water cycle in the Amazon watershed (Fig. 5: top), participants identified surface sensible and latent heat flux, specific humidity, and rain flux as the most important variables for evaluation. It is possible that the more experienced group is more sensitive to the critical role of land–atmosphere coupling in the Amazonian water cycle. This interpretation would be consistent with the additional variables suggested by our survey participants for this Science Driver, which also focused on variables critical to land–atmosphere coupling, e.g., “soil moisture”, “water recycling ratio”, and “plant transpiration” (Supplementary Table S2). While the variables selected for the survey focused largely on mean thermodynamic variables, commenters also mentioned critical features of local dynamics in the Amazon region, such as surface topography and “wind flow over the Andes”, “convection”, and vertical velocity at 850 hPa.

    3.1.5. Science Driver 5: How well does the model simulate important features of the water cycle in the Asian watershed?

    For Science Driver 5, focused on the Asian watershed, participants rated rain flux, surface latent heat flux, and net shortwave radiative flux at the surface as the most important variables (Fig. 5: bottom). For variables included in both Science Drivers, the order of variable importance was the same as in the Amazon watershed, but different than in the Southern Ocean; some of these differences will be discussed in section 3.3. Written responses again mentioned soil moisture (3×) and moisture advection (2×) as important variables missing from the list.

    3.1.6. Science Driver 6: How well does the model simulate the climate impact of clouds globally?

    The final Science Driver addressed the evaluation of cloud properties in the model (Fig. 6) using a satellite simulator, which produces simulated satellite observations and retrievals based on radiative transfer calculations in the model. “Very important” (6) was the most common response for all variables in Science Driver 6 (Supplementary Table S15).

    While differences in responses between the two experience groups did not pass our bar for statistical significance, the high experience group selected “extremely important” more frequently than the low experience group for the “high level cloud cover” and “low cloud cover” items, which also had the highest mean importance ratings in this Science Driver.

    Fig. 6. Science Driver 6: mean responses for the high and low experience groups, ranked by overall mean response from all participants; the color of the dots indicates the standard deviation of responses.

    Five participants indicated that longwave cloud forcing and shortwave cloud forcing should have been included, and one respondent noted: “A complete vertical distribution of cloud properties would be even more interesting than ‘low’, ‘medium’ and ‘high’ cloud cover. Cloud particle size and number would also be interesting.” Another responded that “cloud fraction is a model convenience but is quite arbitrary.”

    3.2. Impact of experience on judgments of variable importance

    We hypothesized that: (H1) respondents with less experience in climate modeling would differ from more experienced respondents in their judgments of relative variable importance; and (H2) respondents with greater experience in climate modeling would exhibit greater consensus in their judgments of the importance of different variables.

    (H1): Using a chi-squared significance test (details in the Supplementary Material), we find support for differences in assessments of variable importance between the high and low experience groups, but only for certain selected variables. Compared to the low experience group, the high experience group rated ocean surface wind stress as more important to the evaluation of global climate (Science Driver 1) and rain flux as more important to the evaluation of Southern Ocean climate (Science Driver 3).
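    The details of the test are in the paper's Supplementary Material; as a hedged sketch of how such a comparison can be set up, the example below builds a group-by-response contingency table and computes the chi-squared statistic by hand. The counts are invented for illustration, and pooling the sparse low-importance categories is our assumption, not necessarily the authors' procedure:

```python
import numpy as np

# Hypothetical Likert response counts (options 1..7) for one variable,
# one row per experience group; the real tables are in the paper's
# supplementary material.
observed = np.array([
    [0, 1, 2, 4, 8, 14, 7],    # high experience (N = 36)
    [1, 3, 6, 10, 18, 15, 7],  # low experience (N = 60)
], dtype=float)

# Pool the sparse low-importance options 1-3 into one column before
# testing, to avoid cells with very small expected counts.
observed = np.column_stack([observed[:, :3].sum(axis=1), observed[:, 3:]])

# Expected counts under independence of group and response.
row = observed.sum(axis=1, keepdims=True)
col = observed.sum(axis=0, keepdims=True)
expected = row @ col / observed.sum()

chi2 = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)  # here, 4
```

    In practice this is usually done with a library routine such as scipy.stats.chi2_contingency, which also returns the p-value; the statistic here would be compared against the chi-squared critical value for the given degrees of freedom (about 9.49 for dof = 4 at the 0.05 level).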

    Some other differences are observable between the two groups (see Supplementary Tables S10–S15), but did not meet our criteria for significance; it is possible that additional differences would emerge if a larger survey population could be attained.

    (H2): We find no statistically significant differences in the degree of consensus between the high and low experience groups.

    The lack of large differences in responses between the high and low experience groups suggests that variations in importance ratings are mainly driven by factors unrelated to the amount of experience the scientists have. Examples could include the specific subdiscipline of the individual expert, or the practices and research foci that are common in their particular research community or geographic area. This result also suggests that expertise in climate model evaluation may reach a plateau after a certain level of proficiency is attained, with additional experience leading to only incremental changes in expert evaluations and judgments. One possible reason for this is that the process of model evaluation is constantly evolving as updated model versions incorporate additional processes and improvements, new observational datasets become available, and new tools are developed to support the evaluation process. As a result, climate scientists continually need to update their understanding of climate models and their evaluation to reflect the current state of the art. Another possible explanation is that the culture of the climate modeling community may promote an efficient transfer of knowledge, as more experienced scientists offer training and advice to less experienced colleagues and to other research groups, shortening the learning curve of new scientists entering the field.

    3.3. Impact of Science Drivers on judgments of variable importance

    We expected that survey participants would rate the importance of the same model variables differently depending on the science goals, and indeed this is what we found. In this section, we focus on the ratings from the high experience group, but results from the low experience group are similar.

    For instance, rain flux was rated as less important to the evaluation of the Southern Ocean (mean: 6.00; σ: 1.12) than to global climate (mean: 6.14; σ: 0.92) or the Asian watershed (mean: 6.32; σ: 1.00), while shortwave and longwave cloud forcing were rated as less important to the Asian watershed (shortwave: mean: 5.48; σ: 0.84; longwave: mean: 5.23; σ: 1.01) than to global climate (shortwave: mean: 5.89; σ: 1.02; longwave: mean: 5.78; σ: 1.02) or Southern Ocean climate (shortwave: mean: 5.63; σ: 0.86; longwave: mean: 5.56; σ: 0.90). Surface wind stress was rated as more important in the Southern Ocean (mean: 5.84; σ: 1.30), and less important in the Asian watershed (mean: 5.10; σ: 1.33), compared to its importance to global climate evaluation (mean: 5.81; σ: 1.02). While total cloud liquid water path was rated as equally important in the Southern Ocean (mean: 5.09; σ: 1.10), Amazon watershed (mean: 5.06; σ: 1.29), and Asian watershed (mean: 5.13; σ: 1.13), total cloud ice water path was rated as less important to the evaluation of the model in the Amazon watershed (mean: 4.45; σ: 1.52) and Asian watershed (mean: 4.74; σ: 1.22) than in the Southern Ocean (mean: 5.03; σ: 1.13).

    These differences indicate that experts adjust the importance assigned to different metrics depending on the science question or region they are focusing on. As a result, we recommend that future work focused on understanding or quantifying expert judgments of model fidelity should always be explicit about the scientific goals for which the model under assessment will be evaluated.

    3.4. Perceived barriers to systematic quantification of model fidelity

    We also explored the community’s perceptions of the current obstacles to systematic quantification of model fidelity (Fig. 7). Survey participants identified the lack of robust statistical metrics (28%) and the lack of analysis tools (10%) as major barriers, with 17% selecting “all of the above”.

    Fig. 7. Perceived barriers to systematic quantification of model fidelity. Answers were selected from a predetermined list in response to the prompt: “Which one among the following, do you feel, is the biggest barrier towards systematic quantification of model fidelity?”

    Many participants selected the option “Other” and contributed written comments. We grouped these into qualitative categories of responses. The most commonly identified issues related to:

    • Lacking or inadequate observational constraints and error estimates for observations (8×);

    • Laboriousness of the tuning process (7×); and

    • Challenges associated with identifying an appropriate single metric of model fidelity (7×).

    On the final point, many of the comments focused on the risk of oversimplifying the analysis and evaluation of models: “Focusing on single metrics oversimplifies the analysis too much to be useful. It is often hard to identify good vs. bad because one aspect works while others don’t, and different models have different trade offs.” “No one metric tells the whole story; this may lead to false confidence in model fidelity.” Another commenter noted that “it’s very hard to create a single metric that accurately encapsulates subjective judgments of many scientists.” Finally, several respondents noted other barriers, including a perceived lack of sufficient expertise in the community, a perception that some widespread practices are inadequate or inappropriate for model evaluation, and a lack of sufficient attention to model sensitivities, as opposed to calibration with respect to present-day mean climate.

    4. Prospects for synthesizing expert assessments and objective model fidelity metrics

    As discussed in section 1, there are many potential applications for a climate model index that summarizes the model’s fidelity with respect to a particular science goal. However, one challenge is that an assessment of which models most resemble the observations depends in part on which observed variables are evaluated, and how much relative importance is assigned to each of them. A model fidelity index can be conceptualized as a weighted average of different objective metrics (Eq. 1), but different experts might reasonably make different choices in assigning values to the weights, resulting in models potentially being ranked differently by different experts, as illustrated in Fig. 8. Furthermore, the information that experts implicitly use, and the relative importance they assign to different aspects of the model’s fidelity when evaluating actual model output, likely differ from their explicit statements about evaluation criteria. A systematic approach is needed to understand which information experts actually use in evaluating models, how much consensus exists among experts about variable importance when evaluating real model output, and how sensitive a proposed model fidelity index would be to differences in these judgments between experts.

    Fig. 8. Illustration of the concept of overall model fidelity rankings and their sensitivity to expert weights. Consider the pair of models uq1 and uq2, where the overall fidelity of the model is evaluated as a weighted mean of several component scores. If uq1 performs better than uq2 on some component scores, but worse on others, the ranking of these models according to their overall mean fidelity metric will be sensitive to how strongly each component metric is weighted. In this example, the rankings of several models using “naive weights” (unweighted average) are compared to rankings that use importance weights derived from the responses of two different experts in our survey.
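
    The ranking sensitivity illustrated in Fig. 8 can be sketched numerically. Since Eq. (1) is not reproduced in this excerpt, the sketch below assumes the plain weighted-mean form of the index; the component scores and the two experts’ weights are hypothetical values chosen to make the ranking flip:

```python
def fidelity_index(scores, weights):
    """Weighted mean of component fidelity scores (higher is better)."""
    return sum(weights[v] * scores[v] for v in scores) / sum(weights.values())

# Hypothetical component scores for the two models of Fig. 8.
uq1 = {"precipitation": 0.9, "cloud forcing": 0.6, "wind stress": 0.7}
uq2 = {"precipitation": 0.7, "cloud forcing": 0.8, "wind stress": 0.8}

# Two hypothetical experts assigning different importance weights.
expert_a = {"precipitation": 3.0, "cloud forcing": 1.0, "wind stress": 1.0}
expert_b = {"precipitation": 1.0, "cloud forcing": 3.0, "wind stress": 2.0}

for label, w in (("expert A", expert_a), ("expert B", expert_b)):
    winner = "uq1" if fidelity_index(uq1, w) > fidelity_index(uq2, w) else "uq2"
    print(label, "ranks", winner, "higher")
```

    Here expert A, who weights precipitation heavily, ranks uq1 above uq2, while expert B’s weights reverse the ordering, even though both experts apply the same index formula to the same scores.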

    The survey described in this paper represents a first step towards building that understanding. It also provides baseline information that will inform and be used in the analysis of a second planned study, in which experts will be invited to evaluate the output from real model simulations. This mock model assessment exercise will enable us to address additional questions, such as: (1) How much consensus exists among experts when evaluating the fidelity of actual model simulations (as opposed to assessing variable importance in the abstract)? (2) Can an index I_inferred be constructed by using experts’ assessments of real model output to infer the weights w_i,inferred that they implicitly assign to the fidelity of different model variables? (3) Do the weights w_i,inferred that are inferred from experts’ assessments of real model output agree or disagree with the relative importance that experts assigned to different variables a priori, as reported in this study?

    5. Summary and conclusions

    In this article we report results from a large community survey on the relative importance of different variables in evaluating a climate model’s fidelity with respect to a particular science goal. We plan to use the results of this study to inform the development of a follow-up study in which experts are invited to evaluate actual model outputs.

    We show that experts’ rankings are sensitive to the scientific objectives. For instance, surface wind stress was rated as among the most important variables in evaluation of Southern Ocean climate, and among the least important in evaluation of the Asian watershed. This suggests the possibility and utility of designing different and unique collections of metrics, tailored to specific science questions and objectives, while accounting explicitly for uncertainty in variable importance.

    We find no statistically significant differences between rankings provided by model developers and model users, suggesting some consistency between the developer and user communities’ understanding of appropriate evaluation criteria. We also find that our “high experience” group, consisting mostly of senior scientists with many years of climate modeling experience, and our “low experience” group, consisting mostly of early and mid-career scientists, were in agreement about the importance of most variables for model evaluation. However, within each group, there are also substantial disagreements and diversity in responses. The level of consensus is particularly low for AOD, which some participants rated as “extremely important” and others rated as “not at all important.” Additionally, in our survey sample, greater experience with evaluating model fidelity was not associated with greater consensus about the importance of different variables in model evaluation, and led to only minor changes in estimates of variable importance, i.e., to small changes in the frequency distribution of importance ratings, which are only statistically significant for a small number of variables.

    It is important to note that when experts’ responses on this survey differ, it does not necessarily imply that their evaluations of actual climate models would also differ. We anticipate that experts perform actual model evaluations in a more holistic manner and draw on much broader information than was included in this survey. In order to make initial progress on this extremely complex topic, we limited the scope of the study to evaluation of global mean climate, but the time-dependent behavior of the system is also critical to assess, as well as features of the coupled climate system. Future research should extend this approach to include evaluation of diurnal and seasonal cycles; multi-year modes of climate variability such as ENSO, QBO, and PDO; extreme weather events; frequency of extreme precipitation; and other time-dependent features of the climate system. Other, more complex metrics of model fidelity could also be considered, e.g., object-based verification approaches, and scale-aware metrics that would be robust to changes in model resolution.

    Several study participants noted that issues related to observational datasets continue to be a major challenge for model evaluation. This includes logistical issues, such as their availability through a centralized repository, in standardized formats, and in updated versions as new data become available. However, more fundamentally, the limitations of observational constraints continue to be a major obstacle, including the lack of observations of certain key model variables, and the lack of estimates of the observational uncertainty for many datasets. Climate model evaluation efforts could also benefit from the increased adoption of metrics and diagnostic visualizations that directly incorporate information on observational uncertainty and natural variability, providing greater transparency and richer contextual information to users of these tools.

    The labor-intensiveness of model evaluation efforts was noted by several survey participants, and is well known to most scientists familiar with climate model development. Climate modeling centers invest an enormous amount of computational and human resources in model tuning. At a rough estimate, tuning a coupled climate model requires the efforts of about five full-time equivalent (FTE) scientists and engineers for each major model component (atmosphere, ocean and sea-ice, and land), as well as five FTEs for the overall software engineering and tuning of the coupled system. An intense tuning effort for a new major version of a coupled climate model may last for about one year and be repeated every five years, for an average investment of four FTEs per year. Globally, there are at least 26 major climate modeling centers (the number that participated in the CMIP5 project: http://cmip-pcmdi.llnl.gov/cmip5/availability.html), of which five are located in the United States (DOE–ACME, NASA–GISS, NASA–GMAO, NCAR, NOAA–GFDL). Assuming that the typical cost to support a staff scientist at a climate modeling center is about $300 thousand per year (including salary, fringe, and overhead expenses), we estimate that the amount of money spent annually on the human effort involved in climate model tuning is roughly $6 million in the United States and $31.2 million globally.
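
    The cost estimate above can be reproduced with simple arithmetic; the following sketch uses only the figures quoted in the text:

```python
# Back-of-envelope check of the tuning-cost estimate quoted in the text.
ftes_per_component = 5        # FTEs per major component, and for the coupled system
n_components = 4              # atmosphere, ocean/sea-ice, land, plus coupled/software
release_cycle_years = 5       # one roughly year-long tuning effort every five years
cost_per_fte = 300_000        # USD per year (salary, fringe, and overhead)

avg_ftes_per_year = ftes_per_component * n_components / release_cycle_years
us_cost = avg_ftes_per_year * cost_per_fte * 5        # five US centers
global_cost = avg_ftes_per_year * cost_per_fte * 26   # 26 CMIP5 centers

print(avg_ftes_per_year, us_cost, global_cost)  # 4.0 6000000.0 31200000.0
```

    This recovers the four FTEs per year per center, $6 million per year in the United States, and $31.2 million per year globally stated above.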

    If appropriate quantitative metrics can be developed that meaningfully capture the criteria important in a comprehensive model assessment, then algorithms could be applied to partially automate the calibration process, for instance by identifying an initial subset of model configurations that produce plausible climates, subject to further manual inspection by teams of experts. Further work is needed to assess the feasibility of such an approach; but if successful, similar approaches could be valuable in the development not only of global climate models, but also of regional weather models, large eddy simulations, and other geophysical and complex computational models in which multiple aspects of fidelity must be assessed and weighed against each other.
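
    One simple form such a pre-screening algorithm could take is threshold filtering: keep only the configurations whose score on every metric clears a plausibility threshold, leaving a smaller subset for manual expert inspection. The sketch below is entirely hypothetical; the configuration names, metrics, scores, and thresholds are invented for illustration:

```python
def plausible(scores, thresholds):
    """True if every metric meets its minimum acceptable score."""
    return all(scores[m] >= t for m, t in thresholds.items())

# Hypothetical normalized skill scores for candidate configurations
# from a tuning ensemble (higher is better).
candidates = {
    "cfg-01": {"precipitation": 0.82, "cloud forcing": 0.75},
    "cfg-02": {"precipitation": 0.55, "cloud forcing": 0.90},
    "cfg-03": {"precipitation": 0.78, "cloud forcing": 0.80},
}
thresholds = {"precipitation": 0.70, "cloud forcing": 0.70}

# Shortlist the configurations that pass on every metric.
shortlist = [name for name, s in candidates.items() if plausible(s, thresholds)]
print(shortlist)  # ['cfg-01', 'cfg-03']
```

    In this toy case, cfg-02 is eliminated despite its strong cloud forcing score, because it fails the precipitation threshold; only the surviving configurations would be passed on for expert review.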

    We suggest that a closer integration of objectively computed metrics with expert understanding of their relative importance has the potential to dramatically improve the efficiency of the model calibration process. The concise variable lists and community ratings reported in this study provide a snapshot of current expert understanding of the relative importance of certain aspects of climate model behavior to their evaluation. This information will be useful to the broader climate research community, and can serve as a starting point for the development of more sophisticated evaluation and scoring criteria for global climate models, with respect to specific scientific objectives.

    Acknowledgements. The authors would like to express their sincere gratitude to everyone who participated in the survey described in this paper. While privacy restrictions prevent us from publishing their identities, we greatly appreciate the time that many busy individuals have taken, voluntarily, to contribute to this research. We would like to thank Hui WAN, Ben KRAVITZ, Hansi SINGH, and Benjamin WAGMAN for helpful comments and discussions that helped to inform this work. This research was conducted under the Laboratory Directed Research and Development Program at PNNL, a multi-program national laboratory operated by Battelle for the U.S. Department of Energy under Contract DE-AC05-76RL01830.

    Open Access This article is distributed under the terms of the Creative Commons Attribution License, which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

    Electronic supplementary material Supplementary material is available in the online version of this article at https://doi.org/10.1007/s00376-018-7300-x.

    REFERENCES

    Booth, B. B. B., N. J. Dunstone, P. R. Halloran, T. Andrews, and N. Bellouin, 2012: Aerosols implicated as a prime driver of twentieth-century North Atlantic climate variability. Nature, 484, 228–232, https://doi.org/10.1038/nature10946.

    Braverman, A., N. Cressie, and J. Teixeira, 2011: A likelihood-based comparison of temporal models for physical processes. Statistical Analysis and Data Mining: The ASA Data Science Journal, 4, 247–258, https://doi.org/10.1002/sam.10113.

    Ericsson, K., 1996: The Road to Expert Performance: Empirical Evidence from the Arts and Sciences, Sports, and Games. Lawrence Erlbaum Associates, 369 pp.

    Gleckler, P. J., K. E. Taylor, and C. Doutriaux, 2008: Performance metrics for climate models. J. Geophys. Res., 113, D06104, https://doi.org/10.1029/2007JD008972.

    Herger, N., G. Abramowitz, R. Knutti, O. Angélil, K. Lehmann, and B. M. Sanderson, 2017: Selecting a climate model subset to optimise key ensemble properties. Earth System Dynamics, 9, 135–151, https://doi.org/10.5194/esd-9-135-2018.

    Hourdin, F., and Coauthors, 2017: The art and science of climate model tuning. Bull. Amer. Meteor. Soc., 98, 589–602, https://doi.org/10.1175/BAMS-D-15-00135.1.

    Knutti, R., J. Sedláček, B. M. Sanderson, R. Lorenz, E. M. Fischer, and V. Eyring, 2017: A climate model projection weighting scheme accounting for performance and interdependence. Geophys. Res. Lett., 44, 1909–1918, https://doi.org/10.1002/2016GL072012.

    Min, S. K., and A. Hense, 2006: A Bayesian approach to climate model evaluation and multi-model averaging with an application to global mean surface temperatures from IPCC AR4 coupled climate models. Geophys. Res. Lett., 33, L08708, https://doi.org/10.1029/2006GL025779.

    Nosedal-Sanchez, A., C. S. Jackson, and G. Huerta, 2016: A new test statistic for climate models that includes field and spatial dependencies using Gaussian Markov random fields. Geoscientific Model Development, 9, 2407–2414, https://doi.org/10.5194/gmd-9-2407-2016.

    Qian, Y., and Coauthors, 2015: Parametric sensitivity analysis of precipitation at global and local scales in the Community Atmosphere Model CAM5. Journal of Advances in Modeling Earth Systems, 7, 382–411, https://doi.org/10.1002/2014MS000354.

    Qian, Y., and Coauthors, 2016: Uncertainty quantification in climate modeling and projection. Bull. Amer. Meteor. Soc., 97, 821–824, https://doi.org/10.1175/BAMS-D-15-00297.1.

    Reichler, T., and J. Kim, 2008: How well do coupled models simulate today’s climate? Bull. Amer. Meteor. Soc., 89, 303–311, https://doi.org/10.1175/BAMS-89-3-303.

    Riffenburgh, R. H., and P. A. Johnstone, 2009: Measuring agreement about ranked decision choices for a single subject. The International Journal of Biostatistics, 5, https://doi.org/10.2202/1557-4679.1113.

    Seinfeld, J. H., and Coauthors, 2016: Improving our fundamental understanding of the role of aerosol-cloud interactions in the climate system. Proceedings of the National Academy of Sciences of the United States of America, 113, 5781–5790, https://doi.org/10.1073/pnas.151404311.

    Stevens, B., 2013: Aerosols: Uncertain then, irrelevant now. Nature, 503, 47–48, https://doi.org/10.1038/503047a.

    Stier, P., 2016: Limitations of passive remote sensing to constrain global cloud condensation nuclei. Atmospheric Chemistry and Physics, 16, 6595–6607, https://doi.org/10.5194/acp-16-6595-2016.

    Stocker, T. F., and Coauthors, 2013: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, 1535 pp, https://doi.org/10.1017/CBO9781107415324.

    Suckling, E. B., and L. A. Smith, 2013: An evaluation of decadal probability forecasts from state-of-the-art climate models. J. Climate, 26, 9334–9347, https://doi.org/10.1175/JCLI-D-12-00485.1.

    Yang, B., and Coauthors, 2013: Uncertainty quantification and parameter tuning in the CAM5 Zhang-McFarlane convection scheme and impact of improved convection on the global circulation and climate. J. Geophys. Res., 118, 395–415, https://doi.org/10.1029/2012JD018213.

    Zhang, T., L. Li, Y. Lin, W. Xue, F. Xie, H. Xu, and X. Huang, 2015: An automatic and effective parameter optimization method for model tuning. Geoscientific Model Development, 8, 3579–3591, https://doi.org/10.5194/gmd-8-3579-2015.
